LOCUS A68001 1624 bp DNA PAT 05-MAY-1999
DEFINITION Sequence 1 from Patent WO9745546.
ACCESSION A68001
VERSION A68001.1 GI:4756806
SOURCE unidentified.
ORGANISM unidentified
unclassified.
REFERENCE 1 (bases 1 to 1624)
AUTHORS Bowles,D.J., O'donnell,P.J., Roberts,M.R. and Calvert,C.M.
TITLE USE OF A NOVEL GLUCOSYL TRANSFERASE
JOURNAL Patent: WO 9745546-A 1 04-DEC-1997;
UNIV YORK (GB)
FEATURES Location/Qualifiers
source 1..1624
/organism="unidentified"
/db_xref="taxon:32644"
BASE COUNT 545 a 252 c 372 g 455 t
ORIGIN

Query Match 31.3%; Score 547.8; DB 9; Length 1624;
Best Local Similarity 61.9%; Pred. No. 4.5e-119;
Matches 942; Conservative 0; Mismatches 542; Indels 39; Gaps 3;

Qy 87 tcacattgccttatttccagttatggctcatggtcacatgatcccaatgttggacatggc 146 

III II II! Mill I I Mill Mill Mill II I I IMIIMI 

Db 1 TCATTTTTTCTTCTTTCCCGATGATGCTCAAGGTCATATGATACCTACACTTGACATGGC 60 
Qy 147 caagctctttacctcaagaggcatacaaacaacaatcatttcgactctcgcc 198 

II I I I I I M I II I II II M I II I I 

Db 61 GAACGTTGTCGCTTGTCGTGGTGTTAAAGCCACTATAATCACAACACCTCTCAATGAATC 120

Qy 199 ttcgctgatccgataaacaaagctcgtgattcgggcctcgatattggactaagcat 254

III II I III III Ml II I II MM I I I

Db 121 TGTTTTCTCTAAAGCTATTGAGAGAAACAAGCATTTAGGTATTGAAATTGATATTCGTTT 180

Qy 255 cctcaaattcccaccagaaggatcaggaataccagatcacatggtgagccttgatctagt 314

II MINIM III I I II II I III IMIIMI II

Db 181 ACTAAAATTCCCAGCTAAGGAGAATGATTTGCCTGAAGATTGTGAGCGTCTTGATCTTGT 240

Qy 315 tactgaagattggctcccaaagtttgttgagtcattagtcttattacaagagccagt 371

I MM II Mill II I I I III MM I I

Db 241 ACCTTCTGATGACAAACTCCCAAACTTCTTAAAAGCTGCGGCTATGATGAAAGATGAATT 300

Qy 372 tgagaagcttatcgaagaactaaagctcgactgtctcgtttccgacatgttcttgccttg 431

MM Mill I MM I II Mill Mill II III I II 1 1

Db 301 TGAGGAGCTTATTGGAGAATGTCGCCCTGATTGTCTTGTTTCTGATATGTTCCTTCCATG 360

Qy 432 gacagtcgattgtgcggctaagttcggtattccgaggttggttttccacggaacgagcaa 491

III III MM II I! II I II II II I II Mill Mill II I

Db 361 GACTACTGATAGTGCAGCCAAATTTAGCATACCAAGAATTGTATTCCATGGAACTAGTTA 420

Qy 492 ctttgcgttgtgtgcttcggagcaaatgaagcttcacaagccttataagaatgtaacttc 551

Db 421 CTTTGCCCTTTGTGTTGGCGATACGATCAGGCGTAATAAGCCTTTCAAGAATGTGTCATC 480



Qy 552 tgatactgagacatttgttataccggatttcccgcatgagctgaagtttgtgaggactca 611 

Illlllll II MINI MINIUM II Mill I I I I MINI 

Db 481 GGATACTGAAACTTTTGTTGTACCGGATTTGCCACATGAAATTAGGCTAACTAGAACACA 540
Qy 612 agtggctccgtttcagcttgcggaaacggagaatggattctcaaagttgatgaaacagat 671 

II Illlllll III MM MM II I I MM III I 

Db 541 GTTGTCTCCGTTTGAGCAATCGGATGAAGAGACGGGTATGGCTCCCATGATTAAAGCTGT 600
Qy 672 gacggagtctgttggtagaagctacggtgttgtggttaacagtttttatgagctcgagtc 731 

II Ml II I II I Mill M III I I II II IIIIIIIIIII II II 

Db 601 GAGGGAATCGGATGCGAAGAGCTATGGAGTTATATTCAATAGCTTTTATGAGCTTGAATC 660 
Qy 732 gacttatgtggattattacagagaggttttgggtagaaagtcttggaatatagggcctct 791 

Mill I II llllll Mill I II Mill I MM III II M II 

Db 661 AGATTATGTTGAACATTACACTAAGGTTGTAGGTAGAAAAAATTGGGCTATTGGTCCGCT 720
Qy 792 gttgttatccaacaatggcaatgaggaaaaagtacaaaggggaaaggaatctgcgattgg 851 

I I I I Ml I I I III II MM MM II I I MM I II I 

Db 721 TTCGCTGTGCAATAGGGATATTGAAGATAAAGCGGAAAGAGGGAGGAAATCATCTATCGA 780 
Qy 852 cgaacacgaatgcttggcttggttgaattccaagaagcagaattcggttgtttacgtttg 911 

Illllll llllll III I MM Mill I III Illllll Mill 

Db 781 TGAACACGCGTGCTTGAAATGGCTTGATTCGAAGAAATCAAGTTCCATTGTTTATGTTTG 840
Qy 912 ttttggaagtatggcgacttttactccagcgcagttgcgcgaaactgcgattggactcga 971 

IIIIIIIIIII II III III MM Ml III III III II II II II 

Db 841 TTTTGGAAGTACAGCAGATTTCACTACAGCACAGATGCAAGAACTTGCTATGGGGCTAGA 900 
Qy 972 ggaatcaggccaagagttcatttgggtagttaaaaaggccaaaaacgaagaagaaggaaa 1031 

I II II Mill IIIIIIIIIII I I III 

Db 901 AGCCTCTGGACAAGATTTCATTTGGGTTATCA GAAC 936

Qy 1032 aggaaaagaagaatggctgccagaaaattttgaggaaagagtgaaagatagaggcttgat 1091 

Ml II Mill Mill llllll II MMMIM Mill I Ml II M 

Db 937 AGGGAATGAAGATTGGCTCCCAGAAGGATTCGAGGAAAGAACAAAAGAAAAAGGTTTAAT 996
Qy 1092 cataagaggatgggcgccgcaattgttgatactcgatcatcctgcggtaggagctttcgt 1151 

I I I M 1 1 M 1 1 1 1 1 1 II III MM II Mill II I Illlllll II 

Db 997 CATAAGAGGATGGGCACCCCAAAGTGTGATTCTTGATCACGAAGCTATTGGAGCTTTTGT 1056 
Qy 1152 gacgcattgtggatggaattcgacgttggaaggaatatgcgccggtgtgcctatggtgac 1211 

M MMMIMIMM Mill IMMMIMM II II II II Illlllll 

Db 1057 TACTCATTGTGGATGGAACTCGACACTGGAAGGAATATCAGCAGGGGTACCAATGGTGAC 1116 
Qy 1212 ttggccagttttcgcagagcagtttttcaatgagaagtttgtgacagaggttttggggac 1271 

Illlllll II II II MIMMMMMMIMM Mill Mill II I I 

Db 1117 ATGGCCAGTATTTGCGGAACAGTTTTTCAATGAGAAGTTGGTGACTGAGGTAATGAGAAG 1176 
Qy 1272 cggtgtttcggttgggaataagaagtggctaagggcagcaagtgaaggtgtgtcgaggga 1331

II I I Mill MM I III II MM Illlllll III II II 

Db 1177 TGGAGCTGGTGTTGGTTCTAAGCAATGGAAGAGAACAGCTAGTGAAGGAGTGAAAAGAGA 1236
Qy 1332 ggcagtgacgaacgcggtgcagcgtgttatggtgggagaaaatgcgtcggagatgagaaa 1391 

Ml I I II III I II I II MM Illllll II I MM 

Db 1237 AGCAATAGCAAAGGCGATAAAGAGAGTAATGGCGAGTGAAGAAACAGAGGGATTCAGAAG 1296



Qy 1392 gcgagcgaagtattataaggaaatggcgaggcgggcggttgaggaaggcggttcgtctta 1451 

1 1 1 1 II I I II MINIM II II MM Mill II M Mill 

Db 1297 CAGAGCAAAAGAGTACAAAGAAATGGCAAGAGAAGCTATTGAAGAAGGAGGATCATCTTA 1356
Qy 1452 taatggtttgaatgagatgatagaggatttgagtgtgtaccgtgctccagaaaaacaaga 1511 

Mill I I I Mill I II I I I II III I II I II 

Db 1357 CAATGGATGGGCTACTTTGATACAAGACATAACTTCATATCGTTAACTAGTTGATGCAAA 1416
Qy 1512 cttaaactagattcttatagatgacttctagtgtgacaattgtaattttttgccttttat 1571 

I I I I II I II I II I I MM I I 

Db 1417 AAAAGAAAAAACATGTGTGTTTCTATATTCTGTCTTCTGTTTTGCTGATTTGATCATATT 1476 
Qy 1572 tcaagtttcctcattagtgttga 1594 
Db 1477 ACGTACTTCTTCATGATAATTAA 1499
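
The summary figures printed with each hit appear to be internally consistent: for every result in this report, the stated Best Local Similarity equals Matches divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels). The sketch below checks that reading against the A68001 figures. The function name is illustrative, and the formula is inferred from the reported numbers rather than taken from the search program's documentation.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent of alignment columns that are exact matches (inferred formula)."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Counts as reported for the A68001 nucleotide hit.
print(round(best_local_similarity(942, 0, 542, 39), 1))  # prints 61.9
```

The same calculation reproduces the 57.3% reported for the AAW47172 protein hit (274 matches, 84 conservative, 107 mismatches, 13 indels).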



RESULT 7 
AAW47172 

ID AAW47172 standard; Protein; 470 AA. 
XX 

AC AAW47172; 
XX 

DT 08-JUN-1998 (first entry) 
XX 

DE Glucosyl transferase (GTase) protein encoded by TWI1 gene. 
XX 

KW Glucosyl transferase; GTase; TWI1; tomato; signalling pathway; 

KW salicylic acid; jasmonic acid; ethylene; wound inducible gene;

KW plant defence protein; plant response; tobacco; rice. 
XX 

OS Lycopersicon sp.
XX

PN WO9745546-A1.

XX

PD 04-DEC-1997.
XX

PF 30-MAY-1997; 97WO-GB01473.
XX

PR 31-MAY-1996; 96GB-0011420.
XX

PA (UYYO-) UNIV YORK.
XX

PI Bowles DJ, Calvert CM, O'Donnell PJ, Roberts MR;
XX 

DR WPI; 1998-032653/03. 

DR N-PSDB; AAV17054. 
XX 

PT Tomato wound inducible (TWI1) gene encoding glucosyl transferase 

PT useful to develop products that alter signalling pathways in plants 

PT by altering of salicylic acid, jasmonic acid or ethylene
XX 

PS Claim 2; Fig 3; 52pp; English. 
XX 

CC This is a glucosyl transferase (GTase) protein encoded by a wound 

CC inducible gene (TWI1) isolated from wounded tomatoes. The TWI1 gene 

CC encodes this GTase from amino acid position 5. The TWI1 gene can be 

CC used to identify homologue GTase encoding genes isolated from tobacco 

CC and rice. A microbial host can be transfected or transformed with a 

CC vector containing the GTase encoding nucleic acids. The products can be

CC used to interfere with GTase and therefore alter signalling pathways in

CC plants, specifically tobacco, rice or tomato plants by altering levels

CC of salicylic acid, jasmonic acid or ethylene. This can induce the

CC production of plant defence proteins such as pathogenesis-related (PR)

CC and proteinase inhibitor (PIN) proteins which regulate plant development

CC (plant growth, reproduction and senescence) and improve plant response to

CC pathogens.

XX 

SQ Sequence 470 AA; 

Query Match 59.2%; Score 1490.5; DB 19; Length 470; 

Best Local Similarity 57.3%; Pred. No. 1.2e-144; 

Matches 274; Conservative 84; Mismatches 107; Indels 13; Gaps 3; 



Qy 1 MGKLHIALFPVMAHGHMIPMLDMAKLFTSRGIQTTIIST LAFADPINKARDSGLDI 56

11:11 M | | | | | | | I I I : I I : : I I I : I I : I : : I : : I 

Db 1 mgelhffffpddaqghmiptldmanvvacrgvkatiittplnesvfskaiernkhlgiei 60

Qy 57 GLSILKFPPEGSGIPDHMVSLDLV-TEDWLPKFVESLVLLQEPVEKLIEELKLDCLVSDM 115 

: :|||| : : :|: MM ::| II I::: :::: MM I : lllllll 
Db 61 dirllkfpakendlpedcerldlvpsddklpnflkaaammkdefeeligecrpdclvsdm 120

Qy 116 FLPWTVDCAAKFGIPRLVFHGTSNFALCASEQMKLHKPYKNVTSDTETFVIPDFPHELKF 175 

I II II I II II II M II II M II II : : : M I : II M II II M M II MM: 

Db 121 flpwttdsaakfsiprivfhgtsyfalcvgdtirrnkpfknvssdtetfvvpdlpheirl 180

Qy 176 VRTQVAPFQLAETENGFSKLMKQMTESVGRSYGVWNSFYELESTYVDYYREVLGRKSWN 235

II I : : II : : : I I : : : I : II : II II : II II II II I M : I : I : M M I 
Db 181 trtqlspfeqsdeetgmapmikavresdaksygvifnsfyelesdyvehytkvvgrknwa 240 

Qy 236 IGPLLLSNNGNEEKVQRGKESAIGEHECLAWLNSKKQNSWYVCFGSMATFTPAQLRETA 295 

II I I I I Ml : I M : I : I II II II : I II : M II I II II I II I M : I I 

Db 241 igplslcnrdiedkaergrkssidehaclkwldskksssivyvcfgstadfttaqmqela 300

Qy 296 IGLEESGQEFIWWKKAKNEEEGKGKEEWLPENFEERVKDRGLIIRGWAPQLLILDHPAV 355 

: II I II M II I M : I MUM II II M : I II I I M M I Mill M 

Db 301 mgleasgqdfiwvir tgnedwlpegfeertkekgliirgwapqsvildheai 352

Qy 356 GAFWHCGWNSTLEGICAGVPMVTWPVFAEQFFNEKFVTEVLGTGVSVGNKKWLRAASEG 415 

II II II II I II II I II II II I M II II I II I I M I IMM : | I I : I : I I II II 
Db 353 gafvthcgwnstlegisagvpmvtwpvfaeqffneklvtevmrsgagvgskqwkrtaseg 412

Qy 416 VSREAVTNAVQRVMVGENASEMRKRAKYYKEMARRAVEEGGSSYNGLNEMIEDLSVYR 473

I MM I : : I M I I II I I I I II I I : II II I II M : I : M : II 

Db 413 vkreaiakaikrvmaseetegfrsrakeykemareaieeggssyngwatliqditsyr 470
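
For anyone post-processing these hit summaries, the semicolon-separated "label value;" layout of the summary lines can be split mechanically. The following is a minimal sketch, assuming a cleanly transcribed line in which the last whitespace-separated token of each field is the value; `parse_summary` is a hypothetical helper, not part of any search tool.

```python
def parse_summary(line):
    """Split a 'Label value; Label value;' summary line into a dict."""
    fields = {}
    for part in line.split(";"):
        part = part.strip()
        if not part:
            continue  # skip the empty piece after the trailing semicolon
        label, _, value = part.rpartition(" ")
        fields[label] = value
    return fields


summary = parse_summary("Query Match 59.2%; Score 1490.5; DB 19; Length 470;")
print(summary["Score"])  # prints 1490.5
```

Values are kept as strings so that percentages and exponents such as 1.2e-144 survive unchanged; convert them only where a number is needed.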



LOCUS A68001 1624 bp DNA PAT 05-MAY-1999
DEFINITION Sequence 1 from Patent WO9745546.
ACCESSION A68001
VERSION A68001.1 GI:4756806
SOURCE unidentified.
ORGANISM unidentified
unclassified.
REFERENCE 1 (bases 1 to 1624)
AUTHORS Bowles,D.J., O'donnell,P.J., Roberts,M.R. and Calvert,C.M.
TITLE USE OF A NOVEL GLUCOSYL TRANSFERASE
JOURNAL Patent: WO 9745546-A 1 04-DEC-1997;
UNIV YORK (GB)
FEATURES Location/Qualifiers
source 1..1624
/organism="unidentified"
/db_xref="taxon:32644"
BASE COUNT 545 a 252 c 372 g 455 t
ORIGIN

Query Match 31.3%; Score 547.8; DB 9; Length 1624;

Best Local Similarity 61.9%; Pred. No. 4.5e-119;

Matches 942; Conservative 0; Mismatches 542; Indels 39; Gaps 3;

Qy 87 tcacattgccttatttccagttatggctcatggtcacatgatcccaatgttggacatggc 146 

III II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 TCATTTTTTCTTCTTTCCCGATGATGCTCAAGGTCATATGATACCTACACTTGACATGGC 60 

Qy 147 caagctctttacctcaagaggcatacaaacaacaatcatttcgactctcgcc 198 

II I I II III I I I I I I I I I I I I I I I 
Db 61 GAACGTTGTCGCTTGTCGTGGTGTTAAAGCCACTATAATCACAACACCTCTCAATGAATC 120 

Qy 199 ttcgctgatccgataaacaaagctcgtgattcgggcctcgatattggactaagcat 254 

I M I I I III III III II I I I I I I I I I I 

Db 121 TGTTTTCTCTAAAGCTATTGAGAGAAACAAGCATTTAGGTATTGAAATTGATATTCGTTT 180

Qy 255 cctcaaattcccaccagaaggatcaggaataccagatcacatggtgagccttgatctagt 314 

II MINIMI I II 1 I II II I I I I I I I M II I II 

Db 181 ACTAAAATTCCCAGCTAAGGAGAATGATTTGCCTGAAGATTGTGAGCGTCTTGATCTTGT 240 

Qy 315 tactgaagattggctcccaaagtttgttgagtcattagtcttattacaagagccagt 371 

I I I I I II M I I II I I I I I I I I I I I I II II 

Db 241 ACCTTCTGATGACAAACTCCCAAACTTCTTAAAAGCTGCGGCTATGATGAAAGATGAATT 300

Qy 372 tgagaagcttatcgaagaactaaagctcgactgtctcgtttccgacatgttcttgccttg 431 

M M M M II I I II I I I I I M I I I II II I M I I II I I I I I II 

Db 301 TGAGGAGCTTATTGGAGAATGTCGCCCTGATTGTCTTGTTTCTGATATGTTCCTTCCATG 360 

Qy 432 gacagtcgattgtgcggctaagttcggtattccgaggttggttttccacggaacgagcaa 491 

III I I I I I I I II II I I I I I II I I I I I I I I I I I M II I I I 

Db 361 GACTACTGATAGTGCAGCCAAATTTAGCATACCAAGAATTGTATTCCATGGAACTAGTTA 420

Qy 492 ctttgcgttgtgtgcttcggagcaaatgaagcttcacaagccttataagaatgtaacttc 551 

I I I I II I II I I I II I I I II I I I II I I II I I I I I I M III 

Db 421 CTTTGCCCTTTGTGTTGGCGATACGATCAGGCGTAATAAGCCTTTCAAGAATGTGTCATC 480



Qy 552 tgatactgagacatttgttataccggatttcccgcatgagctgaagtttgtgaggactca 611 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 

Db 481 GGATACTGAAACTTTTGTTGTACCGGATTTGCCACATGAAATTAGGCTAACTAGAACACA 540

Qy 612 agtggctccgtttcagcttgcggaaacggagaatggattctcaaagttgatgaaacagat 671 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II J I 

Db 541 GTTGTCTCCGTTTGAGCAATCGGATGAAGAGACGGGTATGGCTCCCATGATTAAAGCTGT 600 

Qy 672 gacggagtctgttggtagaagctacggtgttgtggttaacagtttttatgagctcgagtc 731 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 601 GAGGGAATCGGATGCGAAGAGCTATGGAGTTATATTCAATAGCTTTTATGAGCTTGAATC 660 

Qy 732 gacttatgtggattattacagagaggttttgggtagaaagtcttggaatatagggcctct 791

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I 
Db 661 AGATTATGTTGAACATTACACTAAGGTTGTAGGTAGAAAAAATTGGGCTATTGGTCCGCT 720 

Qy 792 gttgttatccaacaatggcaatgaggaaaaagtacaaaggggaaaggaatctgcgattgg 851

I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I 
Db 721 TTCGCTGTGCAATAGGGATATTGAAGATAAAGCGGAAAGAGGGAGGAAATCATCTATCGA 780

Qy 852 cgaacacgaatgcttggcttggttgaattccaagaagcagaattcggttgtttacgtttg 911 

I I I I I I I I I I I I I Ml I I I I I I I I I I I I I I 111111111111 
Db 781 TGAACACGCGTGCTTGAAATGGCTTGATTCGAAGAAATCAAGTTCCATTGTTTATGTTTG 840

Qy 912 ttttggaagtatggcgacttttactccagcgcagttgcgcgaaactgcgattggactcga 971 

M I II I I I I I I II I I I I I I I I I I I I I I I I III I I I I I I I I I I I 
Db 841 TTTTGGAAGTACAGCAGATTTCACTACAGCACAGATGCAAGAACTTGCTATGGGGCTAGA 900

Qy 972 ggaatcaggccaagagttcatttgggtagttaaaaaggccaaaaacgaagaagaaggaaa 1031 

I M I I I I I I I I I I I I I I I I II II III 

Db 901 AGCCTCTGGACAAGATTTCATTTGGGTTATCA GAAC 936 

Qy 1032 aggaaaagaagaatggctgccagaaaattttgaggaaagagtgaaagatagaggcttgat 1091 

III II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I Ml II II 
Db 937 AGGGAATGAAGATTGGCTCCCAGAAGGATTCGAGGAAAGAACAAAAGAAAAAGGTTTAAT 996

Qy 1092 cataagaggatgggcgccgcaattgttgatactcgatcatcctgcggtaggagctttcgt 1151 

I I I I I I I I I I I I I I I I I II I I I I I M I I I I I II I I I II I I I I I I 

Db 997 CATAAGAGGATGGGCACCCCAAAGTGTGATTCTTGATCACGAAGCTATTGGAGCTTTTGT 1056

Qy 1152 gacgcattgtggatggaattcgacgttggaaggaatatgcgccggtgtgcctatggtgac 1211 

II I I I I I I I I I I I I II II I I I I I I M I I I I I I I II II II II I I I I II I I 

Db 1057 TACTCATTGTGGATGGAACTCGACACTGGAAGGAATATCAGCAGGGGTACCAATGGTGAC 1116

Qy 1212 ttggccagttttcgcagagcagtttttcaatgagaagtttgtgacagaggttttggggac 1271 
M M M M M M II I I I I I I I I M I I I I I I I I I I I I I I I I I I I I II I I 

Db 1117 ATGGCCAGTATTTGCGGAACAGTTTTTCAATGAGAAGTTGGTGACTGAGGTAATGAGAAG 1176

Qy 1272 cggtgtttcggttgggaataagaagtggctaagggcagcaagtgaaggtgtgtcgaggga 1331 

I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I II I I I I 
Db 1177 TGGAGCTGGTGTTGGTTCTAAGCAATGGAAGAGAACAGCTAGTGAAGGAGTGAAAAGAGA 1236

Qy 1332 ggcagtgacgaacgcggtgcagcgtgttatggtgggagaaaatgcgtcggagatgagaaa 1391 

I I I I I I I I I I I II I I I I M I I I M I I I II I I I I I 
Db 1237 AGCAATAGCAAAGGCGATAAAGAGAGTAATGGCGAGTGAAGAAACAGAGGGATTCAGAAG 1296



Qy 1392 gcgagcgaagtattataaggaaatggcgaggcgggcggttgaggaaggcggttcgtctta 1451 

I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I II II I I I I I 

Db 1297 CAGAGCAAAAGAGTACAAAGAAATGGCAAGAGAAGCTATTGAAGAAGGAGGATCATCTTA 1356 

Qy 1452 taatggtttgaatgagatgatagaggatttgagtgtgtaccgtgctccagaaaaacaaga 1511 

I I I II I I I II I I I I I I III I I I I I III I M 

Db 1357 CAATGGATGGGCTACTTTGATACAAGACATAACTTCATATCGTTAACTAGTTGATGCAAA 1416

Qy 1512 cttaaactagattcttatagatgacttctagtgtgacaattgtaattttttgccttttat 1571 

I I I I II I II Mill Nil II 

Db 1417 AAAAGAAAAAACATGTGTGTTTCTATATTCTGTCTTCTGTTTTGCTGATTTGATCATATT 1476

Qy 1572 tcaagtttcctcattagtgttga 1594 

I I I I I I I I I III 

Db 1477 ACGTACTTCTTCATGATAATTAA 1499



RESULT 1 
US-09-106-464-2 

Sequence 2, Application US/09106464 
Patent No. 6011145 
GENERAL INFORMATION: 

APPLICANT: Steffens, John C. 
APPLICANT: Ghangas, Gurdev S. 
APPLICANT: Kuai, Jian-Ping 
APPLICANT: Eannetta, Nancy

TITLE OF INVENTION: Chain Length Specific UDP-Glc: Fatty Acid
TITLE OF INVENTION: Glucosyltransferases
NUMBER OF SEQUENCES: 2
CORRESPONDENCE ADDRESS:

ADDRESSEE: Jones, Tullar & Cooper, P.C.
STREET: P.O. Box 2266 Eads Station
CITY: Arlington
STATE: Virginia
COUNTRY: USA
ZIP: 22202
COMPUTER READABLE FORM:

MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/106,464
FILING DATE: 
CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 60/055,554 
FILING DATE: 13-AUG-1997 
ATTORNEY/AGENT INFORMATION: 
NAME: Spector, Eric S. 
REGISTRATION NUMBER: 22495 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 703-415-1500 
TELEFAX: 703-415-1508
INFORMATION FOR SEQ ID NO: 2:
SEQUENCE CHARACTERISTICS: 
LENGTH: 471 amino acids 
TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-09-106-464-2 



Query Match 60.3%; Score 1516.5; DB 3; Length 471;

Best Local Similarity 58.2%; Pred. No. 5.8e-155;

Matches 278; Conservative 84; Mismatches 103; Indels 13; Gaps

Qy 1 MGKLHIALFPVMAHGHMIPMLDMAKLFTSRGIQTTIIST LAFADPINKARDSGLDI 56

Ihll Ihll Mill MINI lh: 111 = 1 h I : = Ml 

Db 2 MGQLHFFFFPMMAQGHMIPTLDMAKLVACRGVKATIITTPLNESVFSKAIERNKHLGIEI 61 

Qy 57 GLSILKFPPEGSGIPDHMVSLDLV-TEDWLPKFVESLVLLQEPVEKLIEELKLDCLVSDM 115 

: Hill = = =h MM -I II l = = = =: = = Ml I = 

Db 62 DIRLLKFPAKENDLPEDCERLDLVPSDDKLPNFLKAAAMMKDEFEELIGECRPDCLVSDM 121 



Qy 116 FLPWTVDCAAKFGIPRLVFHGTSNFALCASEQMKLHKPYKNVTSDTETFVIPDFPHELKF 175 

Mill I llll llhllllll llll = =: M M II M I I II II I : I I II I I : : 
Db 122 FLPWTTDSAAKFSIPRIVFHGTSYFALCVGDSIRRNKPFKNVSSDTETFWPDFPHEIRL 181 

Qy 176 VRTQVAPFQLAETENGFSKLMKQMTESVGRSYGVWNSFYELESTYVDYYREVLGRKSWN 235 

|||::||: I I : -I : II = I I I I = I I I I I I I I I I = = I = h I I h I 

Db 182 TRTQLSPFEQSDEETGMAPMIKAVRESDAKSYGVIFNSFYELESDYVEHYTKWGRKNWA 241 

Qy 236 IGPLLLSNNGNEEKVQRGKESAIGEHECLAWLNSKKQNSWYVCFGSMATFTPAQLRETA 295

llll II II :|h:hl II II Ihlll MINIMI I II 1 h = i I 
Db 242 IGPLSLCNRDIEYKAERGRKSSIDEHACLKWLDSKKSSSIVYVCFGSTADFTTAQMQELA 301

Qy 296 IGLEESGQEFIWWKKAKNEEEGKGKEEWLPENFEERVKDRGLIIRGWAPQLLILDHPAV 355 

:||| 111 = 1111 = = I MMM llll -II MINIMUM M 

Db 302 MGLEASGQDFIWVIR TGNEDWLPEGFEERTKEKGLIIRGWAPQVLILDHEAI 353 

Qy 356 GAFVTHCGWNSTLEGICAGVPMVTWPVFAEQFFNEKFVTEVLGTGVSVGNKKWLRAASEG 415 

IIIIMIIIIIIIIII 1 1 1 1 1 = 1 1 1 E 1 1 1 1 1 1 ! 1 1 MM =1 MM I llll 

Db 354 GAFVTHCGWNSTLEGISAGVPMLTWPVFAEQFFNEKLVTEVMRSGAGVGSKQWKRTASEG 413 

Qy 416 VSREAVTNAVQRVMVGENASEMRKRAKYYKEMARRAVEEGGSSYNGLNEMIEDLSVYR 473

I Mh MMM I I Ml I I I I I I I = I I I I I I I I I MM- II 

Db 414 VKREAIAKAIKRVMASEETEGFRSRAKEYKEMAREAIEEGGSSYNGWATLIQDITSYR 471



RESULT 2 
US-08-797-226-2 

Sequence 2, Application US/08797226 
Patent No. 5959180 
GENERAL INFORMATION:

APPLICANT: MOEHS, CHARLES P
APPLICANT: ALLEN, PAUL V
APPLICANT: ROCKHOLD, DAVID R
APPLICANT: STAPLETON, ANDREW
APPLICANT: GARBARINO, JOAN E
APPLICANT: FRIEDMAN, MENDEL
APPLICANT: BELKNAP, WILLIAM R
TITLE OF INVENTION: DNA SEQUENCES ENCODING SOLANIDINE
TITLE OF INVENTION: UDP-GLUCOSE GLUCOSYLTRANSFERASE AND USE TO REDUCE
TITLE OF INVENTION: GLYCOALKALOIDS IN SOLANACEOUS PLANTS
NUMBER OF SEQUENCES: 2
CORRESPONDENCE ADDRESS: 

ADDRESSEE: NANCY J. PARSONS 
STREET: 800 BUCHANAN ST. 
CITY: ALBANY 
STATE: CA
COUNTRY: USA
ZIP: 94710
COMPUTER READABLE FORM:

MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/797,226
FILING DATE: 



CLASSIFICATION: 536 
ATTORNEY/AGENT INFORMATION:
NAME: PARSONS, NANCY J
REGISTRATION NUMBER: 40,364
REFERENCE/DOCKET NUMBER: 0011.97
TELECOMMUNICATION INFORMATION:
TELEPHONE: (510) 559-5731
TELEFAX: (510) 559-5777
INFORMATION FOR SEQ ID NO: 2:
SEQUENCE CHARACTERISTICS:
LENGTH: 488 amino acids
TYPE: amino acid

TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-08-797-226-2 



Query Match 34.9%; Score 877; DB 2; Length 488;

Best Local Similarity 38.1%; Pred. No. 6.7e-86; 

Matches 191; Conservative 100; Mismatches 152; Indels 58; Gaps 

Qy 4 LHIALFPVMAHGHMIPMLDMAKLFTSRGIQTTIIST LAFADPI -NKARDSGLDIGL 58 

lh I =: II ||::= hll III- M-i II i : I II I = 

Db 11 LHVLFLPFLSAGHFIPLVNAARLFASRGVKATILTTPHNALLFRSTIDDDVRISGFPISI 70 

Qy 59 SILKFPPEGSGIPDHMVSLDLVTEDWLP-KFVESLVLLQEPVEKLIEELKLDCLVSDMFL 117

Db 71 VTIKFPSAEVGLPEGIESFNSATSPEMPHKIFYALSLLQKPMEDKIRELRPDCIFSDMYF 130

Qy 118 PWTVDCAAKFGIPRLVFHGTSNFALCASEQMKLHKPYKNVTSD-TETFVIPDFPHELKFV 176

Db 131 PWTVDIADELHIPRILYNLSAYMCYSIMHNLKVYRPHKQPNLDESQSFWPGLPDEIKFK 190 

Qy 177 RTQVAPFQLAETENG F S KLMKQMTES VGRS YG VWNS F YELE S T YVD Y YRE VLGRKS 233 

Db 191 LSQLTD-DLRKSDDQKTVFDELLEQVEDSEERSYGIVHDTFYELEPAYVDYYQKLKKPKC 249

Qy 234 WNIGPL LLSNNGNEEKVQRGKESAIGEHECLAWLNSKKQNSWYVCFGS 282

h III I : I : I I I : 111 = = ! ! I = ! I III 

Db 250 WHFGPLSHFASKIRSKELISEHNNNEIV IDWLNAQKPKS VL YVS FGS 296 

Qy 283 MATFTPAQLRETAIGLEESGQEFIWWKKAKNEEEGKGKEEWLP-ENFEERVKDRGLIIR 341 

Db 297 MARFPESQLNE I AQALDASNVPF I FVLR - - PNEETA SWLPVGNLEDKTK- KGLYIK 349 

Qy 342 GWAPQLLILDHPAVGAFVTHCGWNSTLEGICAGVPMVTWPVFAEQFFNEKFVTEVLGTGV 401 

II III I I hllll II II I ||:| |::;:M:| I || I h 

Db 350 GWVPQLTIMEHSATGGFMTHCGTNSVLEAITFGVPMITWPLYADQFYNEK-WEVRGLGI 408

Qy 402 SVGNKKWLRAASEG VSREAVTNAVQRVMVGENASE MRKRAKYYKEMARRA 451 

= | | : | | : : |::|:|: : | :| | :||: | 

Db 409 KIGIDVW NEGIEITGPVIESAKIREAIERLMISNGSEEIINIRDRVMAMSKMAQNA 464 

Qy 452 VEEGGSSYNGLNEMIEDLSVY 472

I I I I I I I :h : I
Db 465 TNEGGSSWNNLTALIQHIKNY 485



RESULT 1 
US-09-106-464-1 

Sequence 1, Application US/09106464 
Patent No. 6011145 
GENERAL INFORMATION: 

APPLICANT: Steffens, John C. 
APPLICANT: Ghangas, Gurdev S.
APPLICANT: Kuai, Jian-Ping
APPLICANT: Eannetta, Nancy

TITLE OF INVENTION: Chain Length Specific UDP-Glc: Fatty Acid
TITLE OF INVENTION: Glucosyltransferases
NUMBER OF SEQUENCES: 2
CORRESPONDENCE ADDRESS:

ADDRESSEE: Jones, Tullar & Cooper, P.C.
STREET: P.O. Box 2266 Eads Station
CITY: Arlington
STATE: Virginia
COUNTRY: USA
ZIP: 22202
COMPUTER READABLE FORM:

MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/106,464
FILING DATE: 
CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 60/055,554 
FILING DATE: 13-AUG-1997
ATTORNEY/AGENT INFORMATION:
NAME: Spector, Eric S.
REGISTRATION NUMBER: 22495
TELECOMMUNICATION INFORMATION:
TELEPHONE: 703-415-1500
TELEFAX: 703-415-1508
INFORMATION FOR SEQ ID NO: 1:
SEQUENCE CHARACTERISTICS:
LENGTH: 1627 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
MOLECULE TYPE: cDNA
FEATURE:

NAME/KEY: CDS
LOCATION: 1..1413
US-09-106-464-1 



Query Match 32.3%; Score 565.6; DB 3; Length 1627;
Best Local Similarity 62.4%; Pred. No. 7.8e-158;
Matches 958; Conservative 0; Mismatches 539; Indels 39; Gaps 3;



Qy 74 aaatgggaaaacttcacattgccttatttccagttatggctcatggtcacatgatcccaa 133 

IIIIIII I II II II III Mill I MINIM Mill Mill II I 

Db 2 AAATGGGTCAGCTACATTTTTTCTTCTTTCCCATGATGGCTCAAGGTCATATGATACCTA 61 



Qy 134 tgttggacatggccaagctctttacctcaagaggcatacaaacaacaatcatttcgactc 193 

i iiiiiiii him in i ii i ii i ii ii ii i ii i 

Db 62 CACTTGACATGGCGAAGCTTGTCGCTTGTCGTGGTGTTAAAGCCACTATAATCACAACAC 121 
Qy 194 tcgcc ttcgctgatccgataaacaaagctcgtgattcgggcctcgata 241 

i mi ii i i ii i i i mi ii i ii i 

Db 122 CTCTCAATGAATCTGTTTTCTCTAAAGCTATTGAGAGAAACAAGCATTTAGGTATTGAAA 181 
Qy 242 ttggactaagcatcctcaaattcccaccagaaggatcaggaataccagatcacatggtga 301

Ml I I I II I MM I 111 I I I I llllll II 

Db 182 TTGATATTCGTTTACTAAAATTCCCAGCTAAGGAGAATGATTTGCCTGAAGATTGTGAGC 241 
Qy 302 gccttgatctagt tactgaagattggctcccaaagtttgttgagtcattagtcttat 358 

i iiiiiiii ii i 1 1 1 1 ii iiiiiiii ii I I i ii 

Db 242 GTCTTGATCTTGTACCTTCTGATGACAAACTCCCAAACTTCTTAAAAGCTGCGGCTATGA 301
Qy 359 tacaagagccagttgagaagcttatcgaagaactaaagctcgactgtctcgtttccgaca 418 

I MM I MIM MIMM I MM I M Mill Mill M I 

Db 302 TGAAAGATGAATTTGAGGAGCTTATTGGAGAATGTCGCCCTGATTGTCTTGTTTCTGATA 361
Qy 419 tgttcttgccttggacagtcgattgtgcggctaagttcggtattccgaggttggttttcc 478 

Mill I II MIM III MM II II M I II II II I II MM 

Db 362 TGTTCCTTCCATGGACTACTGATAGTGCAGCCAAATTTAGCATACCAAGAATTGTATTCC 421
Qy 479 acggaacgagcaactttgcgttgtgtgcttcggagcaaatgaagcttcacaagccttata 538 

I Mill II IIIIIIII I MM I 1 1 lllllll lllllll I 

Db 422 ATGGAACTAGTTACTTTGCGCTTTGTGTTGGCGATAGCATCAGGCGTAATAAGCCTTTCA 481
Qy 539 agaatgtaacttctgatactgagacatttgttataccggatttcccgcatgagctgaagt 598 

MMMI I II IIIIIIII II IIMM llllllllll II Mill I I I 

Db 482 AGAATGTGTCATCGGATACTGAAACTTTTGTTGTACCGGATTTTCCACATGAAATTAGGC 541
Qy 599 ttgtgaggactcaagtggctccgtttcagcttgcggaaacggagaatggattctcaaagt 658 

I II II II II IIIIIIII III MM MM II I I 

Db 542 TAACTAGAACACAGTTGTCTCCGTTTGAGCAATCGGATGAAGAGACGGGTATGGCTCCCA 601 
Qy 659 tgatgaaacagatgacggagtctgttggtagaagctacggtgttgtggttaacagttttt 718 

MM Ml Ml III II I II I Mill M III I I II II MM 

Db 602 TGATTAAAGCTGTGAGGGAATCGGATGCGAAGAGCTATGGAGTTATATTCAATAGCTTTT 661 
Qy 719 atgagctcgagtcgacttatgtggattattacagagaggttttgggtagaaagtcttgga 778 

lllllll II II llllll II MMM Mill I IIIIIIII MM 

Db 662 ATGAGCTTGAATCAGATTATGTTGAACATTACACTAAGGTTGTAGGTAGAAAAAATTGGG 721 

Qy 779 atatagggcctctgttgttatccaacaatggcaatgaggaaaaagtacaaaggggaaagg 838 

III II II II I I I I II! I I I III I MM MM II I I 

Db 722 CTATTGGTCCGCTTTCGCTGTGCAATAGGGATATTGAATATAAAGCGGAAAGAGGGAGGA 781 
Qy 839 aatctgcgattggcgaacacgaatgcttggcttggttgaattccaagaagcagaattcgg 898

MM I II I lllllll llllll III I MM Mill I III 

Db 782 AATCATCTATCGATGAACACGCGTGCTTGAAATGGCTTGATTCGAAGAAATCAAGTTCCA 841
Qy 899 ttgtttacgtttgttttggaagtatggcgacttttactccagcgcagttgcgcgaaactg 958 

lllllll llllllllllllllll ii mi mi 1 1 1 1 mi mi in ii 

Db 842 TTGTTTATGTTTGTTTTGGAAGTACAGCAGATTTCACTACAGCACAGATGCAAGAACTTG 901 



Qy 959 cgattggactcgaggaatcaggccaagagttcatttgggtagttaaaaaggccaaaaacg 1018 

I II II II II I II II Mill IMIIIIIIII I I 

Db 902 CTATGGGGCTAGAAGCCTCTGGACAAGATTTCATTTGGGTTATCA 946

Qy 1019 aagaagaaggaaaaggaaaagaagaatggctgccagaaaattttgaggaaagagtgaaag 1078 

III III II Mill Mill M II II II MINIM MM 

Db 947 GAACAGGGAATGAAGATTGGCTCCCAGAAGGATTCGAGGAAAGAACAAAAG 997 

Qy 1079 atagaggcttgatcataagaggatgggcgccgcaattgttgatactcgatcatcctgcgg 1138 

'Mill II MIIIIIIIMIMIII II III II MM II Mill II 

Db 998 AAAAAGGTTTAATCATAAGAGGATGGGCACCCCAAGTGCTGATTCTTGATCACGAAGCTA 1057 
Qy 1139 taggagctttcgtgacgcattgtggatggaattcgacgttggaaggaatatgcgccggtg 1198

I llllll II M MMMIMIMM Mill IIIIIIIMIII II M I 

Db 1058 TTGGAGCTTTTGTTACTCATTGTGGATGGAACTCGACACTGGAAGGAATATCAGCAGGGG 1117 
Qy 1199 tgcctatggtgacttggccagttttcgcagagcagtttttcaatgagaagtttgtgacag 1258 

I II III MM llllll II II II 1 1 1 i 1 1 1 1 M I M M 1 1 1 1 1 Mill I 

Db 1118 TACCAATGTTGACATGGCCAGTATTTGCGGAACAGTTTTTCAATGAGAAGTTGGTGACTG 1177 
Qy 1259 aggttttggggaccggtgtttcggttgggaataagaagtggctaagggcagcaagtgaag 1318 

MM MM II II IMM MM I III II MM MM 

Db 1178 AGGTAATGAGAAGTGGAGCTGGTGTTGGTTCTAAGCAATGGAAGAGAACAGCTAGTGAAG 1237
Qy 1319 gtgtgtcgagggaggcagtgacgaacgcggtgcagcgtgttatggtgggagaaaatgcgt 1378 

I III M M III I I II III I II I II MM M III I I 

Db 1238 GAGTGAAAAGAGAAGCAATAGCAAAGGCGATAAAGAGAGTAATGGCGAGTGAAGAAACAG 1297
Qy 1379 cggagatgagaaagcgagcgaagtattataaggaaatggcgaggcgggcggttgaggaag 1438

II I MM MM M I II II NIL M . M MM MM 

Db 1298 AGGGATTCAGAAGCAGAGCAAAAGAGTACAAAGAAATGGCAAGAGAAGCTATTGAAGAAG 1357
Qy 1439 gcggttcgtcttataatggtttgaatgagatgatagaggatttgagtgtgtaccgtgctc 1498

I II M Mill II II I I I I MMI llllll IMM I 

Db 1358 GAGGATCATCTTACAATGGATGGGCTACTTTGATACAAGACATAACTTCATATCGTTAAC 1417 
Qy 1499 cagaaaaacaagacttaaactagattcttatagatgacttctagtgtgacaattgtaatt 1558 

II I MM I I M II I I I I II I I 

Db 1418 TAGTGATGCAAAAAAAAGAAAAAACATGTGTGTTTCTATATTCTGTCTTCTGTTTTGCTG 1477 
Qy 1559 ttttgccttttattcaagtttcctcattagtgttga 1594 

MM III Ml MM I II I 

Db 1478 ATTTGATCATATTACGTACTTCTTCATCATAATTAA 1513 
induction by other compounds.

AUTHOR(S): Horvath, D.M.; Chua, N.H.

CORPORATE SOURCE: The Rockefeller University, New York, NY.
FILE SEGMENT: Non-U.S. Imprint other than FAO

LANGUAGE: English

AB Tobacco genes that are induced in response to salicylic acid (SA)
treatment with immediate-early kinetics were identified by differential
mRNA display. Detailed analysis of IS10a, one cDNA clone identified by
this method, revealed induction within 30 min of treatment, with a peak of
expression at 3 h, that decayed rapidly thereafter. Treatment with the
protein synthesis inhibitor, cycloheximide (CHX), also caused induction of
IS10a mRNA to comparable levels, but the IS10a mRNA continued to
accumulate after 3 h of induction. In combination, CHX and SA led to a
superinduction of IS10a mRNA levels that was also sustained. Half-maximal
induction was evident at ca. 100-150 micromolar SA. In addition to SA,
induction of IS10a occurred to varying degrees upon treatment with
acetylsalicylic acid, benzoic acid, 2,4-dichlorophenoxyacetic acid, methyl
jasmonate, and hydrogen peroxide, whereas treatment with other compounds
had no effect. The proteins encoded by IS10a and a second highly
homologous cDNA show sequence similarity to UDP-glucose:flavonoid
glucosyltransferases.



RESULT 2 
NTU32644 

LOCUS NTU32644 1624 bp mRNA PLN 25-NOV-1996 

DEFINITION Nicotiana tabacum immediate-early salicylate-induced 

glucosyltransferase (IS5a) mRNA, complete cds.
ACCESSION U32644 

VERSION U32644.1 GI:1685004

KEYWORDS 

SOURCE common tobacco. 

ORGANISM Nicotiana tabacum 

Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
Asteridae; euasterids I; Solanales; Solanaceae; Nicotiana. 
REFERENCE 1 (bases 1 to 1624) 

AUTHORS Horvath,D.M. and Chua,N.H. 

TITLE Identification of an immediate-early salicylic acid-inducible 

tobacco gene and characterization of induction by other compounds 
JOURNAL Plant Mol. Biol. 31 (5), 1061-1072 (1996)
MEDLINE 97000918 
REFERENCE 2 (bases 1 to 1624) 

AUTHORS Horvath,D.M. and Chua,N.H. 
TITLE Direct Submission 

JOURNAL Submitted (29-JUL-1995) Diana M. Horvath, Laboratory of Plant

Molecular Biology, The Rockefeller University, 1230 York Avenue, 
New York, NY 10021, USA 
FEATURES Location/Qualifiers

source 1..1624

/organism="Nicotiana tabacum"
/strain="bright yellow 2"
/db_xref="taxon:4097"
gene 101..1531

/gene="IS5a"
CDS 101..1531

/gene="IS5a"
/codon_start=1

/product="immediate-early salicylate-induced
glucosyltransferase"
/protein_id="AAB36653.1"
/db_xref="GI:1685005"

/translation="MGQLHIFFFPVMAHGHMIPTLDMAKLFASRGVKATIITTPLNEF
VFSKAIQRNKHLGIEIEIRLIKFPAVENGLPEECERLDQIPSDEKLPNFFKAVAMMQE 
PLEQLIEECRPDCLISDMFLPWTTDTAAKFNIPRIVFHGTSFFALCVENSVRLNKPFK 
NVSSDSETFVVPDLPHEIKLTRTQVSPFERSGEETAMTRMIKTVRESDSKSYGVVFNS
FYELETDYVEHYTKVLGRRAWAIGPLSMCNRDIEDKAERGKKSSIDKHECLKWLDSKK 
PSSWYICFGSVANFTASQLHELAMGVEASGQEFIWWRTELDNEDWLPEGFEERTKE 
KGLIIRGWAPQVLILDHESVGAFVTHCGWNSTLEGVSGGVPMVTWPVFAEQFFNEKLV 
TEVLKTGAGVGSIQWKRSASEGVKREAIAKAIKRVMVSEEADGFRNRAKAYKEMARKA 
IEEGGSSYTGLTTLLEDISTYSSTGH" 

BASE COUNT 498 a 293 c 365 g 468 t 

ORIGIN 
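
One quick sanity check on a transcribed GenBank record is that the BASE COUNT figures add up to the length given on the LOCUS line (1624 bp for this entry). The sketch below is an illustration under that assumption; `base_count_total` is a hypothetical helper, and it expects the line in the single-spaced form used in this report.

```python
import re


def base_count_total(line):
    """Sum the per-base counts in a GenBank-style BASE COUNT line."""
    # Each count is a run of digits followed by the base letter it refers to.
    counts = re.findall(r"(\d+)\s+([acgt])\b", line)
    return sum(int(number) for number, _base in counts)


print(base_count_total("BASE COUNT 498 a 293 c 365 g 468 t"))  # prints 1624
```

The A68001 record passes the same check: 545 + 252 + 372 + 455 also gives 1624.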



Query Match 33.6%; Score 588.2; DB 14; Length 1624;
Best Local Similarity 64.4%; Pred. No. 1.3e-128;
Matches 944; Conservative 0; Mismatches 488; Indels 33; Gaps 3;



Qy 68 ttttaaaaatgggaaaacttcacattgccttatttccagttatggctcatggtcacatga 127

III I lllllll I 1 1 1 1 III III Mill II Ml MM 

Db 93 TTTCACAAATGGGTCAGCTCCATATTTTCTTCTTTCCTGTGATGGCTCATGGCCACATGA 152 

Qy 128 tcccaatgttggacatggccaagctctttacctcaagaggcatacaaacaacaatcattt 187

I II I I MINIM MIIIIMI I III I II I I I II M M 

Db 153 TTCCTACACTAGACATGGCGAAGCTCTTTGCTTCACGTGGTGTTAAGGCCACTATAATCA 212 
Qy 188 cgactc tcgccttcgctgatccgataaacaaagctcgtgattcgggcc 235 

Ml I III III I I Ml I I I III III 

Db 213 CAACCCCACTCAATGAATTCGTTTTCTCCAAAGCTATTCAAAGAAACAAGCATTTGGGTA 272 
Qy 236 tcgatattggactaagcatcctcaaattcccaccagaaggatcaggaataccagatcaca 295

MM II II I I I 1 1 1 1 1 1 1 M 1 1 I I II II MM II I 

Db 273 TCGAAATCGAAATCCGTTTGATCAAATTCCCAGCTGTTGAAAACGGCTTACCTGAAGAAT 332
Qy 296 tggtgagccttgatctagt tactgaagattggctcccaaagtttgttgagtcattag 352

I MM MM II I I II II 1 1 1 1 Ill I I I Ml 

Db 333 GCGAACGCCTCGATCAAATCCCTTCAGATGAGAAGCTCCCAAACTTTTTCAAAGCTGTAG 392
Qy 353 tcttattacaagagccagttgagaagcttatcgaagaactaaagctcgactgtctcgttt 412 

I I Mill III I II lllllll MIMI I Ml MMI Ml 

Db 393 CTATGATGCAAGAACCACTAGAACAGCTTATTGAAGAATGTCGCCCCGATTGTCTTATTT 452
Qy 413 ccgacatgttcttgccttggacagtcgattgtgcggctaagttcggtattccgaggttgg 472 

I II MIMI I MMMM III III II II II lllllll li 

Db 453 CAGATATGTTCCTTCCTTGGACTACTGATACTGCAGCAAAATTTAACATTCCAAGAATAG 512 
Qy 473 ttttccacggaacgagcaactttgcgttgtgtgcttcggagcaaatgaagcttcacaagc 532 

I II II II II III MIMI I MM III II III I MM 

Db 513 TCTTTCATGGCACAAGCTTCTTTGCTCTTTGTGTTGAGAATAGCGTCAGGCTAAATAAGC 572 
Qy 533 cttataagaatgtaacttctgatactgagacatttgttataccggatttcccgcatgagc 592 

III MMMM I II III MM II MIMI MIIIIMM II II II 

Db 573 CTTTCAAGAATGTGTCCTCAGATTCTGAAACTTTTGTTGTACCGGATTTGCCTCACGAAA 632 
Qy 593 tgaagtttgtgaggactcaagtggctccgtttcagcttgcggaaacggagaatggattct 652

I III I II II II Ml MMMM III I I MM I I 

Db 633 TTAAGCTGACCAGAACCCAGGTGTCTCCGTTTGAGCGATCTGGGGAAGAGACGGCTATGA 692
Qy 653 caaagttgatgaaacagatgacggagtctgttggtagaagctacggtgttgtggttaaca 712 

I I MM III MIIIIMI I II II I II II II I I MM 

Db 693 CCCGGATGATAAAAACAGTCAGGGAATCAGATTCAAAGAGCTATGGAGTTGTTTTCAACA 752 
Qy 713 gtttttatgagctcgagtcgacttatgtggattattacagagaggttttgggtagaaagt 772 

MM MMMM II I MUM II MM I MM MIIIIMI 

Db 753 GTTTCTATGAGCTTGAAACAGATTATGTTGAGCATTATACTAAGGTGCTGGGTAGAAGAG 812 
Qy 773 cttggaatatagggcctctgttgttatccaacaatggcaatgaggaaaaagtacaaaggg 832 

I II II III II Mill MM II II I I II III II MM MM I 

Db 813 CTTGGGCTATTGGCCCTCTATCGATGTGCAACAGGGACATTGAAGATAAAGCTGAAAGAG 872 

Qy 833 gaaaggaatctgcgattggcgaacacgaatgcttggcttggttgaattccaagaagcaga 892

Mill MM I MM lllllll MIMI III I MM Mill I I 

Db 873 GAAAGAAATCCTCTATTGATAAACACGAGTGCTTGAAATGGCTTGATTCGAAGAAACCAA 932 



Qy 893 attcggttgtttacgtttgttttggaagtatggcgacttttactccagcgcagttgcgcg 952

Db 933 GTTCCGTCGTTTACATTTGTTTTGGAAGCGTAGCGAATTTCACTGCATCACAACTGCACG 992
Qy 953 aaactgcgattggactcgaggaatcaggccaagagttcatttgggtagttaaaaaggcca 1012 

II III II III I II I II II Mill IIIIIIIIIII 1 1 1 1 II 

Db 993 AACTTGCTATGGGAGTTGAAGCTTCCGGACAAGAATTCATTTGGGTTGTTAGAA 1046 

Qy 1013 aaaacgaagaagaaggaaaaggaaaagaagaatggctgccagaaaattttgaggaaagag 1072 

II I II II Mill III MM III II lllllllll 

Db 1047 CAGAACTAGACAACGAAGATTGGTTGCCTGAAGGATTCGAGGAAAGAA 1094 

Qy 1073 tgaaagatagaggcttgatcataagaggatgggcgccgcaattgttgatactcgatcatc 1132

MUM I III II II MMIMMIMM II III I I II II Mill 

Db 1095 CGAAAGAGAAAGGTTTAATAATAAGAGGATGGGCACCCCAAGTACTAATTCTTGATCACG 1154 
Qy 1133 ctgcggtaggagctttcgtgacgcattgtggatggaattcgacgttggaaggaatatgcg 1192 

I II llllllll II II MMMM MMMM M' I IIIMI I I I 

Db 1155 AATCTGTGGGAGCTTTTGTTACACATTGTGGTTGGAATTCAACACTAGAAGGAGTTTCAG 1214 
Qy 1193 ccggtgtgcctatggtgacttggccagttttcgcagagcagtttttcaatgagaagtttg 1252 

II 1 1 M Mill II Mill M II II Mill lllllllllllllllll I 

Db 1215 GAGGGGTTCCAATGGTAACATGGCCTGTATTTGCTGAGCAATTTTTCAATGAGAAGTTAG 1274 
Qy 1253 tgacagaggttttggggaccggtgtttcggttgggaataagaagtggctaagggcagcaa 1312 

MM MMMMI II II I I Mill I I III II MM I 

Db 1275 TGACTGAGGTTTTGAAAACTGGAGCTGGTGTTGGTTCGATACAATGGAAGAGATCAGCTA 1334
Qy 1313 gtgaaggtgtgtcgagggaggcagtgacgaacgcggtgcagcgtgttatggtgggagaaa 1372

Null III II II III I I II II I II I II IIIMI I III 

Db 1335 GTGAAGGAGTGAAAAGAGAAGCAATAGCTAAGGCAATAAAGAGAGTAATGGTGAGTGAAG 1394
Qy 1373 atgcgtcggagatgagaaagcgagcgaagtattataaggaaatggcgaggcgggcggttg 1432

Ml II Mill MM II llllllll Mill II III III 

Db 1395 AAGCAGATGGATTCAGAAACAGAGCTAAAGCGTATAAGGAGATGGCAAGAAAGGCTATTG 1454

Qy 1433 aggaaggcggttcgtcttataatggtttgaatgagatgatagaggatttgagtgtgtacc 1492

I Mill II II Mill I III MM I II I II III I III II 
Db 1455 AAGAAGGAGGGTCATCTTACACTGGATTGACTACTTTGTTGGAAGATATAAGTACATATA 1514 

Qy 1493 gtgctccagaaaaacaagacttaaa 1517 

ii i i I I mi in 

Db 1515 GTTCCACTGGTCATTAAGTTATGAA 1539
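The middle line of each alignment block marks the columns where the query and database residues agree. Where that line has not survived legibly, it can be regenerated from the Qy and Db rows. The following is a minimal Python sketch for nucleotide rows, assuming a case-insensitive comparison and "|" as the match symbol; the function name `match_line` is hypothetical and is not the search program's own routine.

```python
def match_line(query: str, subject: str) -> str:
    """Build a '|' / ' ' marker line for two equal-length aligned rows."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same number of columns")
    marks = []
    for q, s in zip(query, subject):
        # A gap character never counts as a match.
        is_match = q != "-" and q.upper() == s.upper()
        marks.append("|" if is_match else " ")
    return "".join(marks)


# Final block of the alignment above: Qy 1493-1517 against Db 1515-1539.
qy = "gtgctccagaaaaacaagacttaaa"
db = "GTTCCACTGGTCATTAAGTTATGAA"
line = match_line(qy, db)
print(qy)
print(line)
print(db)
print(line.count("|"), "matching columns out of", len(line))
```

For protein alignments a second symbol for conservative substitutions would be needed, which requires a scoring matrix and is outside this sketch.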



RESULT 5 

AF346431 

LOCUS AF346431 1428 bp mRNA PLN 02-APR-2001
DEFINITION Nicotiana tabacum phenylpropanoid:glucosyltransferase 1 (togt1)
mRNA, partial cds.
ACCESSION AF346431
VERSION AF346431.1 GI:13492673
KEYWORDS
SOURCE common tobacco.
ORGANISM Nicotiana tabacum
Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
Magnoliophyta; eudicotyledons; core eudicots; Asteridae; euasterids
I; Solanales; Solanaceae; Nicotiana.
REFERENCE 1 (bases 1 to 1428)
AUTHORS Fraissinet-Tachet,L., Baltz,R., Chong,J., Kauffmann,S., Fritig,B.
and Saindrenan,P.
TITLE Two tobacco genes induced by infection, elicitor and salicylic acid
encode glucosyltransferases acting on phenylpropanoids and benzoic
acid derivatives, including salicylic acid
JOURNAL
MEDLINE 99039922
PUBMED 9824316
REFERENCE 2 (bases 1 to 1428)
AUTHORS Fraissinet-Tachet,L., Baltz,R., Chong,J., Fritig,B., Beffa,R. and
Saindrenan,P.
TITLE Direct Submission
JOURNAL Submitted (05-FEB-2001) Phytopathologie Moleculaire, Institut de
Biologie Moleculaire des Plantes du CNRS, 12 rue du general Zimmer,
Strasbourg 67000, France
FEATURES Location/Qualifiers
source 1..1428
/organism="Nicotiana tabacum"
/cultivar="Samsun NN"
/db_xref="taxon:4097"
gene 1..>1428
/gene="togt1"
/note="allele of Nicotiana tabacum IS5a encoded by GenBank
Accession Number U23643"
CDS 1..>1428
/gene="togt1"
/note="glucosyltransferase"
/codon_start=1
/product="phenylpropanoid:glucosyltransferase 1"
/protein_id="AAK28303.1"
/db_xref="GI:13492674"
/translation="MGQLHFFFFPVMAHGHMIPTLDMAKLFASRGVKATIITTPLNEF
VFSKAIQRNKHLGIEIEIRLIKFPAVENGLPEECERLDQIPSDEKLPNFFKAVAMMQE
PLEQLIEECRPDCLISDMFLPWTTDTAAKFNIPRIVFHGTSFFALCVENSVRLNKPFK
NVSSDSETFVVPDLPHEIKLTRTQVSPFERSGEETAMTRMIKTVRESDSKSYGVVFNS
FYELETDYVEHYTKVLGRRAWAIGPLSMCNRDIEDKAERGKKSSIDKHECLKWLDSKK
PSSVVYVCFGSVANFTASQLHELAMGIEASGQEFIWVVRTELDNEDWLPEGFEERTKE
KGLIIRGWAPQVLILDHESVGAFVTHCGWNSTLEGVSGGVPMVTWPVFAEQFFNEKLV
TEVLKTGAGVGSIQWKRSASEGVKREAIAKAIKRVMVSEEADGFRNRAKAYKEMARKA
IEEGGSSYTGLTTLLEDISTYSSTGH"
BASE COUNT 440 a 256 c 340 g 392 t
ORIGIN
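The base composition reported for a nucleotide record should add up to the sequence length given on its LOCUS line, which is a cheap way to catch digits misread during extraction. The following is a minimal Python sketch; the function name `base_count_total` is hypothetical and it assumes the counts appear as "number letter" pairs, as in the AF346431 record above.

```python
import re


def base_count_total(line: str) -> int:
    """Sum the per-base counts in a 'BASE COUNT'-style line."""
    # Each count is written as an integer followed by a single base letter.
    pairs = re.findall(r"(\d+)\s+([acgt])\b", line.lower())
    if not pairs:
        raise ValueError("no base counts found")
    return sum(int(number) for number, _base in pairs)


# Counts from the AF346431 record; the LOCUS line gives a length of 1428 bp.
total = base_count_total("BASE COUNT 440 a 256 c 340 g 392 t")
print(total, total == 1428)
```

The same check can be applied to the U32644 record (498 a, 293 c, 365 g, 468 t against a length of 1624).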



Query Match 33.2%; Score 581.8; DB 13; Length 1428;
Best Local Similarity 64.6%; Pred. No. 4.1e-127;
Matches 931; Conservative 0; Mismatches 477; Indels 33; Gaps 3;



Qy 76 atgggaaaacttcacattgccttatttccagttatggctcatggtcacatgatcccaatg 135

him i ii ii ii mi 1 1 1 m ii 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ii i 

Db 1 ATGGGTCAGCTCCATTTTTTCTTCTTTCCTGTGATGGCTCATGGCCACATGATTCCTACA 60 

Qy 136 ttggacatggccaagctctttacctcaagaggcatacaaacaacaatcatttcgactc-- 193

I MINI MINIMI I III I II I I I II II 1 1 Ml I 

Db 61 CTAGACATGGCGAAGCTCTTTGCTTCACGTGGTGTTAAGGCCACTATAATCACAACCCCA 120

Qy 194 tcgccttcgctgatccgataaacaaagctcgtgattcgggcctcgatatt 243 

III III I I I II I I I III III MM II 

Db 121 CTCAATGAATTCGTTTTCTCCAAAGCTATTCAAAGAAACAAGCATTTGGGTATCGAAATC 180 

Qy 244 ggactaagcatcctcaaattcccaccagaaggatcaggaataccagatcacatggtgagc 303

III I I II 1 1 II II I II I I II II MINI I I II 

Db 181 GAAATCCGTTTGATCAAATTCCCAGCTGTTGAAAACGGCTTACCTGAAGAATGCGAACGC 240

Qy 304 cttgatctagt tactgaagattggctcccaaagtttgttgagtcattagtcttatta 360

II MM II I I II II 1 M I i 1 1 1 1 III I I I III I I 

Db 241 CTCGATCAAATCCCTTCAGATGAGAAGCTCCCAAACTTTTTCAAAGCTGTAGCTATGATG 300

Qy 361 caagagccagttgagaagcttatcgaagaactaaagctcgactgtctcgtttccgacatg 420 

Mill III I II I II M II MM I Ml Mill MM II Ml 

Db 301 CAAGAACCACTAGAACAGCTTATTGAAGAATGTCGCCCCGATTGTCTTATTTCAGATATG 360

Qy 421 ttcttgccttggacagtcgattgtgcggctaagttcggtattccgaggttggttttccac 480 

III I llllllll III Ml II II II Mill II I II II II 

Db 361 TTCCTTCCTTGGACTACTGATACTGCAGCAAAATTTAACATTCCAAGAATAGTCTTTCAT 420

Qy 481 ggaacgagcaactttgcgttgtgtgcttcggagcaaatgaagcttcacaagccttataag 540 

II II III Mill I MM III I I III I IMMM III 

Db 421 GGCACAAGCTTCTTTGCTCTTTGTGTTGAGAATAGCGTCAGGCTAAATAAGCCTTTCAAG 480 

Qy 541 aatgtaacttctgatactgagacatttgttataccggatttcccgcatgagctgaagttt 600 

Mill I II III MM II MM MINIUM II II II I III I 

Db 481 AATGTGTCCTCAGATTCTGAAACTTTTGTTGTACCGGATTTGCCTCACGAAATTAAGCTG 540

Qy 601 gtgaggactcaagtggctccgtttcagcttgcggaaacggagaatggattctcaaagttg 660 

II II II III INIINI III I I INI I II III 

Db 541 ACCAGAACCCAGGTGTCTCCGTTTGAGCGATCTGGGGAAGAGACGGCTATGACCCGGATG 600 

Qy 661 atgaaacagatgacggagtctgttggtagaagctacggtgttgtggttaacagtttttat 720

II III I I III II I I I INN II INN I llllllll III 

Db 601 ATAAAAACAGTCAGGGAATCAGATTCAAAGAGCTATGGAGTTGTTTTCAACAGTTTCTAT 660 

Qy 721 gagctcgagtcgacttatgtggattattacagagaggttttgggtagaaagtcttggaat 780 

INN II I MM II INI I INI IINIINI INN I 

Db 661 GAGCTTGAAACAGATTATGTTGAGCATTATACTAAGGTGCTGGGTAGAAGAGCTTGGGCT 720

Qy 781 atagggcctctgttgttatccaacaatggcaatgaggaaaaagtacaaaggggaaaggaa 840

II II Mill I I I I Mill I II III II MM MM MIMI II 

Db 721 ATTGGCCCTCTATCGATGTGCAACAGGGACATTGAAGATAAAGCTGAAAGAGGAAAGAAA 780 



Qy 841 tctgcgattggcgaacacgaatgcttggcttggttgaattccaagaagcagaattcggtt 900 

II I 1 1 1 1 Mill MINI Ml I 1 1 1 1 Mill I I III II 

Db 781 TCCTCTATTGATAAACACGAGTGCTTGAAATGGCTTGATTCGAAGAAACCAAGTTCCGTC 840
Qy 901 gtttacgtttgttttggaagtatggcgacttttactccagcgcagttgcgcgaaactgcg 960 

! M i 1 1 1 ; 1 1 1 1 1 1 1 1 1 1 1 I MM Ml III II I II III MM III 

Db 841 GTTTACGTTTGTTTTGGAAGCGTAGCGAATTTCACTGCATCACAACTGCACGAACTTGCT 900
Qy 961 attggactcgaggaatcaggccaagagttcatttgggtagttaaaaaggccaaaaacgaa 1020 

II III I II I II M Mill MINIMI MM M 

Db 901 ATGGGAATTGAAGCTTCCGGACAAGAATTCATTTGGGTTGTTAGAA 946

Qy 1021 gaagaaggaaaaggaaaagaagaatggctgccagaaaattttgaggaaagagtgaaagat 1080 

II I II M Mill III MM Ml II I 1 1 1 ! 1 1 1 llllll 

Db 947 CAGAACTAGACAACGAAGATTGGTTGCCTGAAGGATTCGAGGAAAGAACGAAAGAG 1002 

Qy 1081 agaggcttgatcataagaggatgggcgccgcaattgttgatactcgatcatcctgcggta 1140 

I III II II III llllllll II Ml I I II II Mill I II 

Db 1003 AAAGGTTTAATAATAAGAGGATGGGCACCCCAAGTACTAATTCTTGATCACGAATCTGTG 1062 
Qy 1141 ggagctttcgtgacgcattgtggatggaattcgacgttggaaggaatatgcgccggtgtg 1200

II M II M II II llllllll IMIMM II I llllll I I I II II 

Db 1063 GGAGCTTTTGTTACACATTGTGGTTGGAATTCAACACTAGAAGGAGTTTCAGGAGGGGTT 1122 
Qy 1201 cctatggtgacttggccagttttcgcagagcagtttttcaatgagaagtttgtgacagag 1260

II Mill II Mill li M II Mill II II I II M I II 1 1 1 M Mill Ml 

Db 1123 CCAATGGTAACATGGCCTGTATTTGCTGAGCAATTTTTCAATGAGAAGTTAGTGACTGAG 1182 
Qy 1261 gttttggggaccggtgtttcggttgggaataagaagtggctaagggcagcaagtgaaggt 1320

llllll llllll Mill I I Ml II MM llllllll 

Db 1183 GTTTTGAAAACTGGAGCTGGTGTTGGTTCGATACAATGGAAGAGATCAGCTAGTGAAGGA 1242 
Qy 1321 gtgtcgagggaggcagtgacgaacgcggtgcagcgtgttatggtgggagaaaatgcgtcg 1380

III llllllll I 1 1 1 1 I M 1 1 1 II II II I I II I II 

Db 1243 GTGAAAAGAGAAGCAATAGCTAAGGCAATAAAGAGAGTAATGGTGAGTGAAGAAGCAGAT 1302
Qy 1381 gagatgagaaagcgagcgaagtattataaggaaatggcgaggcgggcggttgaggaaggc 1440

I llllll MM II llllllll Mill II III MM Mill 

Db 1303 GGATTCAGAAACAGAGCTAAAGCGTATAAGGAGATGGCAAGAAAGGCTATTGAAGAAGGA 1362
Qy 1441 ggttcgtcttataatggtttgaatgagatgatagaggatttgagtgtgtaccgtgctcca 1500 

II II Mill I Ml MM I II I II III I Ml II II I I 

Db 1363 GGGTCATCTTACACTGGATTGACTACTTTGTTGGAAGATATAAGTACATATAGTTCCACT 1422

Qy 1501 g 1501 
I 

Db 1423 G 1423 



RESULT 1 
T03747 

glucosyltransferase IS5a (EC 2.4.1.-), salicylate-induced - common tobacco
C;Species: Nicotiana tabacum (common tobacco)
C;Date: 24-Mar-1999 #sequence_revision 24-Mar-1999 #text_change 21-Jul-2000
C;Accession: T03747
R;Horvath, D.M.; Chua, N.H.
Plant Mol. Biol. 31, 1061-1072, 1996
A;Title: Identification of an immediate-early salicylic acid-inducible tobacco
gene and characterization of induction by other compounds.
A;Reference number: Z15050; MUID:97000918
A;Accession: T03747
A;Status: translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-476 <HOR>
A;Cross-references: EMBL:U32644; NID:g1685004; PIDN:AAB36653.1; PID:g1685005
A;Experimental source: strain Bright Yellow 2
C;Genetics:
A;Gene: IS5a
C;Superfamily: flavonol O3-glucosyltransferase
C;Keywords: glycosyltransferase; hexosyltransferase



Query Match 61.9%; Score 1558.5; DB 2; Length 476;
Best Local Similarity 59.7%; Pred. No. 2.6e-112;
Matches 285; Conservative 84; Mismatches 97; Indels 11; Gaps 3;

Qy 1 MGKLHIALFPVMAHGHMIPMLDMAKLFTSRGIQTTIIST LAFADPINKARDSGLDI 56 

Ihlll IIIIIIIMII -Ml |||:: MM |= I : = - 

Db 1 MGQLHIFFFPVMAHGHMIPTLDMAKLFASRGVKATIITTPLNEFVFSKAIQRNKHLGIEI 60 

Qy 57 GLSILKFPPEGSGIPDHMVSLDLV-TEDWLPKFVESLVLLQEPVEKLIEELKLDCLVSDM 115 

: -III =1 = 1= II = = = = II I = = = MMMMIII = MhMI 

Db 61 EIRLIKFPAVENGLPEECERLDQIPSDEKLPNFFKAVAMMQEPLEQLIEECRPDCLISDM 120 

Qy 116 FLPWTVDCAAKFGIPRLVFHGTSNFALCASEQMKLHKPYKNVTSDTETFVIPDFPHELKF 175 

I I I I I I MM 111 = 111111 MM : = I = I I = I I I = I I = I i I t : I t I t I = I 

Db 121 FLPWTTDTAAKFNIPRIVFHGTSFFALCVENSVRLNKPFKNVSSDSETFWPDLPHEIKL 180 



Qy 176 VRTQVAPFQLAETENGFSKLMKQMTESVGRSYGVWNSFYELESTYVDYYREVLGRKSWN 235 

Illhlh = I --I = II =11111 llllllh IhM MllhM 
Db 181 TRTQVSPFERSGEETAMTRMIKTVRESDSKSYGWFNSFYELETDYVEHYTKVLGRRAWA 240 

Qy 236 IGPLLLSNNGNEEKVQRGKESAIGEHECLAWLNSKKQNSVVYVCFGSMATFTPAQLRETA 295

MM = I hi Mlhhl Mill Ihlll MllhMMM II =11 I I 

Db 241 IGPLSMCNRDIEDKAERGKKSSIDKHECLKWLDSKKPSSWYICFGSVANFTASQLHELA 300 
Qy 296 IGLEESGQEFIWWKKAKNEEEGKGKEEWLPENFEERVKDRGLIIRGWAPQLLILDHPAV 355 

= 1 = 1 llllllllh = l = MM MM l = = llllllllll = lllll =1 

Db 301 MGVEASGQEFIWVVRTELD NEDWLPEGFEERTKEKGLIIRGWAPQVLILDHESV 354

Qy 356 GAFVTHCGWNSTLEGICAGVPMVTWPVFAEQFFNEKFVTEVLGTGVSVGNKKWLRAASEG 415 

II I II 1 1 II II I II h MMMMMMMMM Mill II 11= =1 MM 

Db 355 GAFVTHCGWNSTLEGVSGGVPMVTWPVFAEQFFNEKLVTEVLKTGAGVGSIQWKRSASEG 414 



Qy 416 VSREAVTNAVQRVMVGENASEMRKRAKYYKEMARRAVEEGGSSYNGLNEMIEDLSVY 472

Db 415 VKREAIAKAIKRVMVSEEADGFRNRAKAYKEMARKAIEEGGSSYTGLTTLLEDISTY 471



RESULT 2 
T03745 

glucosyltransferase IS10a (EC 2.4.1.-), salicylate-induced - common tobacco
C;Species: Nicotiana tabacum (common tobacco)
C;Date: 24-Mar-1999 #sequence_revision 24-Mar-1999 #text_change 21-Jul-2000
C;Accession: T03745
R;Horvath, D.M.; Chua, N.H.
Plant Mol. Biol. 31, 1061-1072, 1996
A;Title: Identification of an immediate-early salicylic acid-inducible tobacco
gene and characterization of induction by other compounds.
A;Reference number: Z15050; MUID:97000918
A;Accession: T03745
A;Status: translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-476 <HOR>
A;Cross-references: EMBL:U32643; NID:g1685002; PIDN:AAB36652.1; PID:g1685003
A;Experimental source: strain Bright Yellow 2
C;Genetics:
A;Gene: IS10a
C;Superfamily: flavonol O3-glucosyltransferase
C;Keywords: glycosyltransferase; hexosyltransferase

Query Match 60.3%; Score 1517.5; DB 2; Length 476;
Best Local Similarity 58.5%; Pred. No. 3.7e-109;
Matches 279; Conservative 85; Mismatches 102; Indels 11; Gaps 3;
Qy 1 MGKLHIALFPVMAHGHMIPMLDMAKLFTSRGIQTTIIST LAFADPINKARDSGLDI 56

MMI MINIM U MM III:: MM h I = = MM 

Db 1 MGQLHFFFFPVMAHGHMIPTLDMAKLVASRGVKATIITTPLNESVFSKSIQRNKHLGIEI 60 

Qy 57 GLSILKFPPEGSGIPDHMVSLDLV-TEDWLPKFVESLVLLQEPVEKLIEELKLDCLVSDM 115 

: -III =h|: 111= -I II I -IMMMMI = MIIMI 

Db 61 EIRLIKFPAVENGLPEECERLDLIPSDDKLPNFFKAVAMMQEPLEQLIEECRPNCLVSDM 120 

Qy 116 FLPWTVDCAAKFGIPRLVFHGTSNFALCASEQMKLHKPYKNVTSDTETFVIPDFPHELKF 175 

Mill I I I I I :||:|IIMI I I I I ::|:||:M|:||:|ll|:h Mhl 

Db 121 FLPWTTDTAAKFNMPRIVFHGTSFFALCVENSIRLNKPFKNVSSDSETFWPNLPHEIKL 180 

Qy 176 VRTQVAPFQLAETENGFSKLMKQMTESVGRSYGVWNSFYELESTYVDYYREVLGRKSWN 235 

IMMM : | ::::| : || MMM III III IhM MMMM 
Db 181 TRTQLSPFEQSGEETTMTRMIKSVRESDSKSYGVIFNSFNELEHDYVEHYTKVLGRRAWA 240 

Qy 236 IGPLLLSNNGNEEKVQRGKESAIGEHECLAWLNSKKQNSVVYVCFGSMATFTPAQLRETA 295

I I I I : I hi Ml,: M MM MM : M I t I i I M = I II Ml I I 
Db 241 IGPLSMCNRDIEDKAERGKQSSIDKHECLKWLDSKKPSSWYVCFGSVANFTASQLHELA 300 

Qy 296 IGLEESGQEFIWWKKAKNEEEGKGKEEWLPENFEERVKDRGLIIRGWAPQLLILDHPAV 355 

MM IMMM : : MM III M M 1 1 1 1 1 1 1 II M 1 1 1 M 

Db 301 MGIEASGQEFIWWRTELD NEDWLPEGLEERTKEKGLIIRGWAPQVLILDHESV 354 

Qy 356 GAFVTHCGWNSTLEGICAGVPMVTWPVFAEQFFNEKFVTEVLGTGVSVGNKKWLRAASEG 415

IIIIIIIMIIMII: Ml -'Ml || Mill II M M IMIM 

Db 355 GAFVTHCGWNSTLEGVSGGVPMVTWPVFAEQFFNEKLVTEVLKTGAGVGSIQWKRSASEG 414



Qy 416 VSREAVTNAVQRVMVGENASEMRKRAKYYKEMARRAVEEGGSSYNGLNEMIEDLSVY 472

I llh |::||ll II I III 111111 = 1 = 1 Mill II -Ihl 
Db 415 VKREAIAKAIKRVMVSEEAEGFRNRAKAYKEMARKAIEGGGSSYTGLTTLLEDISTY 471



RESULT 1 
Q9SXF2 

ID Q9SXF2 PRELIMINARY; PRT; 476 AA.
AC Q9SXF2;
DT 01-MAY-2000 (TrEMBLrel. 13, Created)
DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update)
DT 01-JUN-2000 (TrEMBLrel. 14, Last annotation update)
DE UDP-GLUCOSE:FLAVONOID 7-O-GLUCOSYLTRANSFERASE.
GN UFGT.
OS Scutellaria baicalensis.
OC Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
OC Magnoliophyta; eudicotyledons; core eudicots; Asteridae; euasterids I;
OC Lamiales; Lamiaceae; Scutellaria.
OX NCBI_TaxID=65409;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=HAIRY ROOT;
RA Hirotani M., Suzuki H., Yoshikawa T.;
RT "Cloning and expression of UDP-glucose: Flavonoid 7-O-
RT glucosyltransferase from hairy root cultures of Scutellaria
RT baicalensis.";
RL Submitted (AUG-1999) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AB031274; BAA83484.1; -.
DR InterPro; IPR002213; -.
DR Pfam; PF00201; UDPGT; 1.
DR PROSITE; PS00375; UDPGT; 1.
KW Transferase.
SQ SEQUENCE 476 AA; 53095 MW; 0A2805359B5EDE2A CRC64;



Query Match 69.5%; Score 1749.5; DB 10; Length 476;
Best Local Similarity 68.7%; Pred. No. 1.1e-124;
Matches 332; Conservative 63; Mismatches 79; Indels 9; Gaps

Qy 1 MGKLHIALFPVMAHGHMIPMLDMAKLFTSRGIQTTIISTLAFADPINKARDSGLDIGLSI 60 

Ml | |::|||||||||||||||:|!|::||||:| Mhll Mhll Mlh 
Db 1 MGQLHIVLVPMIAHGHMIPMLDMAKLFSSRGVKTTIIATPAFAEPIRKARESGHDIGLTT 60 

Qy 61 LKFPPEGSGIPDHMVSLDLVTEDWLPKFVESLVLLQEPVEKLIEELKLDCLVSDMFLPWT 120

Nihil :|h: III Ihl II I H IIIIMhMMI llllllllllll 
Db 61 TKFPPKGSSLPDNIRSLDQVTDDLLPHFFRALELLQEPVEEIMEDLKPDCLVSDMFLPWT 120

Qy 121 VDCAAKFGIPRLVFHGTSNFALCASEQMKLHKPYKNVTSDTETFVIPDFPHELKFVRTQV 180

I MINIMUM II I M : 111111 = 11 = 1 Ih llh Mlh 

Db 121 TDSAAKFGIPRLLFHGTSLFARCFAEQMSIQKPYKNVSSDSEPFVLRGLPHEVSFVRTQI 180 

Qy 181 APFQLAE-TENGFSKLMKQMTESVGRSYGVWNSFYELESTYVDYYREVLGRKSWNIGPL 239 

M :: |||: ||| :: M M I M I I II II II M I : M M I I 

Db 181 PDYELQEGGDDAFSKMAKQMRDADKKSYGDVINSFEELESEYADYNKNVFGKKAWHIGPL 240

Qy 240 LLSNNGNEEK-VQRGKESAIGEHECLAWLNSKKQNSVVYVCFGSMATFTPAQLRETAIGL 298

Ml hi 1 1 MM I Ml II Mill III 1 1 1 Ml 1 1 1 1 1 II I II M Mhll 

Db 241 KLFNNRAEQKSSQRGKESAIDDHECLAWLNSKKPNSVVYMCFGSMATFTPAQLHETAVGL 300
Qy 299 EESGQEFIWVVKKAKNEEEGKGKEEWLPENFEERVKDRGLIIRGWAPQLLILDHPAVGAF 358

I IMIIMM I hllh MM M I M I M II MM 1 1 , 1 : Ml 

Db 301 ESSGQDFIWVVR NGGENEDWLPQGFEERIKGKGLMIRGWAPQVMILDHPSTGAF 354



Qy 359 VTHCGWNSTLEGICAGVPMVTWPVFAEQFFNEKFVTEVLGTGVSVGNKKWLRAASEGVSR 418 

IIIIIMIIMIIIIhllllllllMlhlll Mill MINIMI | Ml 

Db 355 VTHCGWNSTLEGICAGLPMVTWPVFAEQFYNEKLVTEVLKTGVSVGNKKWQR-VGEGVGS 413 

Qy 419 EAVTNAVQRVMVGENASEMRKRAKYYKEMARRAVEEGGSSYNGLNEMIEDLSVYRAPEKQ 478

III 11 = 11111= hill II llllllhllllllllll II Mhll I I II 
Db 414 EAVKEAVERVMVGDGAAEMRSRALYYKEMARKAVEEGGSSYNNLNALIEELSAYVPPMKQ 473 

Qy 479 DLN 481 
II 

Db 474 GLN 476 



RESULT 2 
P93365 

ID P93365 PRELIMINARY; PRT; 476 AA.
AC P93365;
DT 01-MAY-1997 (TrEMBLrel. 03, Created)
DT 01-MAY-1997 (TrEMBLrel. 03, Last sequence update)
DT 01-MAY-2000 (TrEMBLrel. 13, Last annotation update)
DE IMMEDIATE-EARLY SALICYLATE-INDUCED GLUCOSYLTRANSFERASE.
GN IS5A.
OS Nicotiana tabacum (Common tobacco).
OC Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
OC Magnoliophyta; eudicotyledons; core eudicots; Asteridae; euasterids I;
OC Solanales; Solanaceae; Nicotiana.
OX NCBI_TaxID=4097;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=BRIGHT YELLOW 2;
RX MEDLINE=97000918; PubMed=8843948;
RA Horvath D.M., Chua N.H.;
RT "Identification of an immediate-early salicylic acid-inducible tobacco
RT gene and characterization of induction by other compounds.";
RL Plant Mol. Biol. 31:1061-1072(1996).
DR EMBL; U32644; AAB36653.1; -.
DR Mendel; 9420; Nicta; 2542; 9420.
DR InterPro; IPR002213; -.
DR Pfam; PF00201; UDPGT; 1.
DR PROSITE; PS00375; UDPGT; 1.
KW Transferase.
SQ SEQUENCE 476 AA; 53614 MW; 7C8FD61CEA853F67 CRC64;



Query Match 61.9%; Score 1558.5; DB 10; Length 476;
Best Local Similarity 59.7%; Pred. No. 3.6e-110;
Matches 285; Conservative 84; Mismatches 97; Indels 11; Gaps 3;

Qy 1 MGKLHIALFPVMAHGHMIPMLDMAKLFTSRGIQTTIIST LAFADPINKARDSGLDI 56 

Ihlll MIMIIMM Mill Mh: Mhl h I : : - 

Db 1 MGQLHIFFFPVMAHGHMIPTLDMAKLFASRGVKATIITTPLNEFVFSKAIQRNKHLGIEI 60

Qy 57 GLSILKFPPEGSGIPDHMVSLDLV-TEDWLPKFVESLVLLQEPVEKLIEELKLDCLVSDM 115 

: ::||| :|:|: || : ::: || | ::: ::|||:|:|||| = Mhlll 

Db 61 EIRLIKFPAVENGLPEECERLDQIPSDEKLPNFFKAVAMMQEPLEQLIEECRPDCLISDM 120 



Qy 116 FLPWTVDCAAKFGIPRLVFHGTSNFALCASEQMKLHKPYKNVTSDTETFVIPDFPHELKF 175 

Mill I I I I I llhllllll I I I I ::| :|| : , | ; : M : I I I hi 111 = 1 

Db 121 FLPWTTDTAAKFNIPRIVFHGTSFFALCVENSVRLNKPFKNVSSDSETFVVPDLPHEIKL 180



Qy 176 VRTQVAPFQLAETENGFSKLMKQMTESVGRSYGVWNSFYELESTYVDYYREVLGRKSWN 235 

lllhlh : I ■■■■■■■■\ : II hllll llllllh Ihh hllhh 
Db 181 TRTQVSPFERSGEETAMTRMIKTVRESDSKSYGVVFNSFYELETDYVEHYTKVLGRRAWA 240

Qy 236 IGPLLLSNNGNEEKVQRGKESAIGEHECLAWLNSKKQNSVVYVCFGSMATFTPAQLRETA 295

I I I I : I hi : I I I = I = i hill Ihlll hllhlllhl II hi I I 
Db 241 IGPLSMCNRDIEDKAERGKKSSIDKHECLKWLDSKKPSSWYICFGSVANFTASQLHELA 300 

Qy 296 IGLEESGQEFIWWKKAKNEEEGKGKEEWLPENFEERVKDRGLIIRGWAPQLLILDHPAV 355 

hh Ihlllllh = Ihlll llll hhl llllllhllll 

Db 301 MGVEASGQEFIWVVRTELD NEDWLPEGFEERTKEKGLIIRGWAPQVLILDHESV 354

Qy 356 GAFVTHCGWNSTLEGICAGVPMVTWPVFAEQFFNEKFVTEVLGTGVSVGNKKWLRAASEG 415

: I I I I I I I I I I I I h I I I I I I I I I I I I I h hill || Ih h Ihlll 
Db 355 GAFVTHCGWNSTLEGVSGGVPMVTWPVFAEQFFNEKLVTEVLKTGAGVGSIQWKRSASEG 414

Qy 416 VSREAVTNAVQRVMVGENASEMRKRAKYYKEMARRAVEEGGSSYNGLNEMIEDLSVY 472 

I I I h hhlh I I I III hhhhhlhhl II -I hi 
Db 415 VKREAIAKAIKRVMVSEEADGFRNRAKAYKEMARKAIEEGGSSYTGLTTLLEDISTY 471 



RESULT 3 
P93364 

ID P93364 PRELIMINARY; PRT; 476 AA.
AC P93364;
DT 01-MAY-1997 (TrEMBLrel. 03, Created)
DT 01-MAY-1997 (TrEMBLrel. 03, Last sequence update)
DT 01-MAY-2000 (TrEMBLrel. 13, Last annotation update)
DE IMMEDIATE-EARLY SALICYLATE-INDUCED GLUCOSYLTRANSFERASE.
GN IS10A.
OS Nicotiana tabacum (Common tobacco).
OC Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
OC Magnoliophyta; eudicotyledons; core eudicots; Asteridae; euasterids I;
OC Solanales; Solanaceae; Nicotiana.
OX NCBI_TaxID=4097;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=BRIGHT YELLOW 2;
RX MEDLINE=97000918; PubMed=8843948;
RA Horvath D.M., Chua N.H.;
RT "Identification of an immediate-early salicylic acid-inducible tobacco
RT gene and characterization of induction by other compounds.";
RL Plant Mol. Biol. 31:1061-1072(1996).
DR EMBL; U32643; AAB36652.1; -.
DR Mendel; 9419; Nicta; 2542; 9419.
DR InterPro; IPR002213; -.
DR Pfam; PF00201; UDPGT; 1.
DR PROSITE; PS00375; UDPGT; 1.
KW Transferase.
SQ SEQUENCE 476 AA; 53454 MW; 761A43837A17A232 CRC64;



Query Match 60.3%; Score 1517.5; DB 10; Length 476;
Best Local Similarity 58.5%; Pred. No. 4.6e-107;
Matches 279; Conservative 85; Mismatches 102; Indels 11; Gaps 3;



Qy 1 MGKLHIALFPVMAHGHMIPMLDMAKLFTSRGIQTTIIST LAFADPINKARDSGLDI 56 

11 = 11 MINIMI MM llh: MM h I = = MM 

Db 1 MGQLHFFFFPVMAHGHMIPTLDMAKLVASRGVKATIITTPLNESVFSKSIQRNKHLGIEI 60 

Qy 57 GLSILKFPPEGSGIPDHMVSLDLV-TEDWLPKFVESLVLLQEPVEKLIEELKLDCLVSDM 115 

= -III = hh llh = = l II I = = = = = 111 = 1 = 1111 = =111111 

Db 61 EIRLIKFPAVENGLPEECERLDLIPSDDKLPNFFKAVAMMQEPLEQLIEECRPNCLVSDM 120

Qy 116 FLPWTVDCAAKFGIPRLVFHGTSNFALCASEQMKLHKPYKNVTSDTETFVIPDFPHELKF 175 

lllll I I I I I :|hllllll I I I I ::|:||:|||:||:||||:|: |||:| 

Db 121 FLPWTTDTAAKFNMPRIVFHGTSFFALCVENSIRLNKPFKNVSSDSETFWPNLPHEIKL 180 

Qy 176 VRTQVAPFQLAETENGFSKLMKQMTESVGRSYGVVVNSFYELESTYVDYYREVLGRKSWN 235

lll-lh = I ::::| = II Hllh III III I I = = I HIII-I 
Db 181 TRTQLSPFEQSGEETTMTRMIKSVRESDSKSYGVIFNSFNELEHDYVEHYTKVLGRRAWA 240

Qy 236 IGPLLLSNNGNEEKVQRGKESAIGEHECLAWLNSKKQNSVVYVCFGSMATFTPAQLRETA 295

I I I I I hi :|lhhl : I I II MM :|||||||||:| II :|| I I 
Db 241 IGPLSMCNRDIEDKAERGKQSSIDKHECLKWLDSKKPSSVVYVCFGSVANFTASQLHELA 300

Qy 296 IGLEESGQEFIWVVKKAKNEEEGKGKEEWLPENFEERVKDRGLIIRGWAPQLLILDHPAV 355

= hl IIIIIMM : MM III MMI MIMMMIM =1 

Db 301 MGIEASGQEFIWVVRTELD NEDWLPEGLEERTKEKGLIIRGWAPQVLILDHESV 354

Qy 356 GAFVTHCGWNSTLEGICAGVPMVTWPVFAEQFFNEKFVTEVLGTGVSVGNKKWLRAASEG 415

l lllll; I ! I llllh 1 1 1 1 1 1 ! 1 1 ! : i : 1 1 1 1 MM ii im m imiii 

Db 355 GAFVTHCGWNSTLEGVSGGVPMVTWPVFAEQFFNEKLVTEVLKTGAGVGSIQWKRSASEG 414 
Qy 416 VSREAVTNAVQRVMVGENASEMRKRAKYYKEMARRAVEEGGSSYNGLNEMIEDLSVY 472 

I llh MMM II I Ml 1 1 i 1 1 1 = I = I lllll II = = lhl I 

Db 415 VKREAIAKAIKRVMVSEEAEGFRNRAKAYKEMARKAIEGGGSSYTGLTTLLEDISTY 471 

RESULT 4 
Q43526 

ID Q43526 PRELIMINARY; PRT; 466 AA.
AC Q43526;
DT 01-NOV-1996 (TrEMBLrel. 01, Created)
DT 01-NOV-1996 (TrEMBLrel. 01, Last sequence update)
DT 01-MAY-2000 (TrEMBLrel. 13, Last annotation update)
DE TWI1 (FRAGMENT).
GN TWI1.
OS Lycopersicon esculentum (Tomato).
OC Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
OC Magnoliophyta; eudicotyledons; core eudicots; Asteridae; euasterids I;
OC Solanales; Solanaceae; Solanum.
OX NCBI_TaxID=4081;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=CV. MONEYMAKER; TISSUE=WOUNDED LEAF;

RA gggggg^^^^^^^erty H.M., Loake G.J. , Mcpherson M.J., 



DR Mendel; 8950; Lyces; 2542; 8950.
DR InterPro; IPR002213; -.
DR Pfam; PF00201; UDPGT; 1.
DR PROSITE; PS00375; UDPGT; 1.
FT NON_TER 1 1
SQ SEQUENCE 466 AA; 52457 MW; 293DFFBE898AC8A7 CRC64;

Query Match 58.6%; Score 1474.5; DB 10; Length 466; 

Best Local Similarity 57.2%; Pred. No. 8.2e-104; 

Matches 271; Conservative 83; Mismatches 107; Indels 13; Gaps 
Qy 5 HIALFPVMAHGHMIPMLDMAKLFTSRGIQTTIIST LAFADPINKARDSGLDIGLSI 60



Db 1 HFFFFPDDAQGHMIPTLDMANVVACRGVKATIITTPLNESVFSKAIERNKHLGIEIDIRL 60

Qy 61 LKFPPEGSGIPDHMVSLDLV-TEDWLPKFVESLVLLQEPVEKLIEELKLDCLVSDMFLPW 119 

Db 61 LKFPAKENDLPEDCERLDLVPSDDKLPNFLKAAAMMKDEFEELIGECRPDCLVSDMFLPW 120

Qy 120 TVDCAAKFGIPRLVFHGTSNFALCASEQMKLHKPYKNVTSDTETFVIPDFPHELKFVRTQ 179

I I I I I I MhllllM I I I I = : I I = I I I = I I I I I I I = M llh: III 

Db 121 TTDSAAKFSIPRIVFHGTSYFALCVGDTIRRNKPFKNVSSDTETFWPDLPHEIRLTRTQ 180 

Qy 180 VAPFQLAETENGFSKLMKQMTESVGRSYGWVNSFYELESTYVDYYREVLGRKSWNIGPL 239 

= Mh := | | : ::| : || :||||: |||||||| ll = = l =hllhl MM 
Db 181 LSPFEQSDEETGMAPMIKAVRESDAKSYGVIFNSFYELESDYVEHYTKWGRKNWAIGPL 240 

Qy 240 LLSNNGNEEKVQRGKESAIGEHECLAWLNSKKQNSVVYVCFGSMATFTPAQLRETAIGLE 299

I I hi MhMM II II Ihlll MMIIIMI I II IhM hill 
Db 241 SLCNRDIEDKAERGRKSSIDEHACLKWLDSKKSSSIVYVCFGSTADFTTAQMQELAMGLE 300

Qy 300 ESGQEFIWVVKKAKNEEEGKGKEEWLPENFEERVKDRGLIIRGWAPQLLILDHPAVGAFV 359

MIMIM- I hill MM hMMMIMM Mill hill 

Db 301 ASGQDFIWVIR TGNEDWLPEGFEERTKEKGLIIRGWAPQSVILDHEAIGAFV 352

Qy 360 THCGWNSTLEGICAGVPMVTWPVFAEQFFNEKFVTEVLGTGVSVGNKKWLRAASEGVSRE 419 

II II II II II 1 1 M 1 1 II 1 1 1 II II I II II I II I h M I h h I I 1 1 1 II II 

Db 353 THCGWNSTLEGISAGVPMVTWPVFAEQFFNEKLVTEVMRSGAGVGSKQWKRTASEGVKRE 412
Qy 420 AVTNAVQRVMVGENASEMRKRAKYYKEMARRAVEEGGSSYNGLNEMIEDLSVYR 473 

h hhl I I III MUM hlllllllll MM:: || 

Db 413 AIAKAIKRVMASEETEGFRSRAKEYKEMAREAIEEGGSSYNGWATLIQDITSYR 466 
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LOCUS (LOC) : 
AF117267 GenBank (R)
GenBank ACC. NO. (GBN): AF117267
CAS REGISTRY NO. (RN): 225568-86-5
SEQUENCE LENGTH (SQL): 1819
MOLECULE TYPE (CI): mRNA; linear
DIVISION CODE (CI): Plants, fungi, algae
DATE (DATE): 20 Apr 1999
DEFINITION (DEF): Malus domestica UDP glucose:flavonoid 3-O-glucosyl
transferase (UFGT1) mRNA, complete cds.
SOURCE: apple tree.
ORGANISM (ORGN): Malus x domestica
Eukaryota; Viridiplantae; Streptophyta; Embryophyta;
Tracheophyta; Spermatophyta; Magnoliophyta;
eudicotyledons; core eudicots; Rosidae; eurosids I;
Rosales; Rosaceae; Malus
NUCLEIC ACID COUNT (NA): 488 a 446 c 455 g 430 t
REFERENCE: 1 (bases 1 to 1819)
AUTHOR (AU): Lee,J.-R.; Hong,S.-T.; Yoo,Y.G.; Kim,S.-R.
TITLE (TI): Molecular cloning and expression of anthocyanin
biosynthesis genes from 'Fuji apple'
JOURNAL (SO): Unpublished
REFERENCE: 2 (bases 1 to 1819)
AUTHOR (AU): Lee,J.-R.; Hong,S.-T.; Yoo,Y.G.; Kim,S.-R.
TITLE (TI): Direct Submission
JOURNAL (SO): Submitted (31-DEC-1998) Life Science, Sogang
University, 1 Shinsoo-Dong, Mapo-Gu, Seoul 121-742,
South Korea



FEATURES (FEAT):
Feature Key   Location    Qualifier
source        1..1819     /organism="Malus x domestica"
                          /cultivar="Fuji"
                          /db_xref="taxon:3750"
                          /tissue_type="peel"
                          /note="Malus domestica Borkh"
gene          1..1819     /gene="UFGT1"
CDS           72..1523    /gene="UFGT1"
                          /codon_start=1
                          /product="UDP glucose:flavonoid
                          3-O-glucosyl transferase"
                          /protein_id="AAD26203.1"
                          /db_xref="GI:4588779"
                          /translation="MAAPLPIEIEPSSTNGQPHLADAYNRHVAVVAFPFTSHASALLE
                          TVRRLATALPNTLFSFFSTSKSNSSLFSNNSIDNMPRNIRVYDVADGVPEGYVFVGKP
                          QEDIELFMNAAPENIRRSLDASVADIGKQISCLITDAFLWFGVHLADELGVPWVTFWI
                          SGLKSLSVHVHTDLIRDTIGTQGITGRENDLIVDKNVNIQGLSNVRIKDLAEGVIFGN
                          LDSVISGMLLQMGRLLPRATAVFMNGFEELELPIPNDLKSKVNKLLNVGPSNVASPLP
                          PLPPSDACLSWLDKQQAPSSVVYISFGTVASPAEKEQMAIAEALEATGAPFLWSIKDS
                          CKTPLLNEFLTKTLSKLNGMVVPWAPQPHVLAHDSVGAFVSHCGWNSIMETIAGRVPM
                          ICRPYFADQRLNARMVEEVFEIGVTVEDGVFTREGLVKSLEVVLSPESGRKFRDNIKR
                          VKQLAVEAVGPQGSSTRNFKSLLDIVSGSNYQV"



SEQUENCE (SEQ):

1 gctaactcca ttattccatc agtactgcta cttctattca actccttttc taattagcct
61 tgtaagctgt aatggcagcg ccgctgccca tcgaaatcga accatcatca actaatggtc
121 aaccccatct cgccgacgcc tacaaccgtc acgtggctgt cgtagccttc cctttcacta
181 gccatgcaag cgccttgctt gagaccgtgc gccgcctagc caccgccctt ccaaacactc
241 tcttctcgtt cttcagcact tcaaaatcca acagctctct cttttccaac aacagcattg
301 ataacatgcc gcgtaacata agggtgtacg atgtggctga cggggtgccg gaggggtacg
361 ttttcgtggg caagccgcag gaggacatag agctcttcat gaatgccgca ccggaaaaca
421 tccggaggag cttagacgct tccgtggcgg acatcgggaa gcagatcagc tgcttgatca
481 ccgacgcctt cctttggttt ggagtccact tggctgacga gttgggagtg ccttgggtca
541 ctttctggat ctccggactc aaatccctct ccgttcatgt gcatactgat ctcatccgcg
601 acactattgg aactcaaggc attacaggtc gtgaaaacga cctcatcgtc gacaaaaatg
661 ttaacatcca aggtctctcc aatgtacgaa tcaaagactt agcggaagga gtcattttcg
721 gaaacttgga ctcggtaatt tccggcatgc tacttcagat gggacggctc ctcccccgtg
781 ccaccgcagt tttcatgaac ggcttcgaag aattggaact ccccatacca aacgacctaa
841 agtccaaagt caacaaactc ctcaacgtag gaccttccaa cgtagcatcc ccgctgccac
901 cgctgccgcc atcagatgct tgcttgtcat ggctagacaa gcaacaggct ccatcctccg
961 tcgtgtacat aagcttcggg acagtggcga gcccagcgga gaaggagcag atggcaatag
1021 cggaggccct ggaagccacc ggagcaccct tcttgtggtc tatcaaggac agctgcaaga
1081 caccgttgct gaacgagttc ttgacaaaaa cattgtcaaa gctgaacggg atggtggtgc
1141 cgtgggctcc acagccgcat gtactggccc acgattcggt cggagccttc gtgtcgcatt
1201 gcggctggaa ctcgataatg gagactatag caggacgggt gcccatgatt tgtaggccat
1261 attttgcaga ccagaggctt aatgcaagga tggtggagga ggtgtttgag atcggggtaa
1321 ccgtggagga tggagttttt accagggagg ggctggtaaa aagcttggaa gtggttttgt
1381 cgcctgaaag tgggaggaaa ttcagagaca atataaagag ggtcaaacaa ctggcagtag
1441 aggcggttgg accacaaggg agctccactc ggaacttcaa atcgctgttg gacatcgtat
1501 caggatccaa ttatcaagta tagtacgggg accataaata tgtagcacta aaaatacaca
1561 ccagctgcaa tagctgttcg ttgctcaata ataatgtagc acgctacgat ctaccctacg
1621 agcggacata tttagggcgg gtttggtatt gatgtgcttt gaaaaaaaga tgctgtgaga
1681 ataagcggtt gtgctgtgaa aataagcggc tgtgaaataa atcagcagag tgtttgataa
1741 acgtttttgt aaaagtgctt ttggaaagaa aaaaaacagt ctaatagtgt gtcttttcat
1801 taaaaaaaaa aaaaaaaaa



